enzo-project / enzo-e

A version of Enzo designed for exascale and built on charm++.

Conservation of mass and energy #56

Open pgrete opened 3 years ago

pgrete commented 3 years ago

In the process of testing the DD branch, I ran the test problem (Hi.in) following the "Getting Started" instructions (on a "standard" Linux system with GNU compilers and CELLO_PREC=double).

The output I get includes information like

0 00009.18 Method Field density sum 1.4601588822392469e+03 conserved to 4.74825 digits of 16
 pass  0/4 build/Cello/problem_MethodFluxCorrect.cpp 205  MethodFluxCorrect precision (density)
0 00009.18 Method Field internal_energy sum 0.0000000000000000e+00 conserved to inf digits of 16
0 00009.18 Method Field total_energy sum 4.3561676005276215e+03 conserved to 4.49216 digits of 16
0 00009.18 Method Field velocity_x sum -6.8978075245716434e-03 conserved to -3.34049 digits of 16
0 00009.18 Method Field velocity_y sum 1.0612441232024589e-03 conserved to -13.1133 digits of 16

If I read this correctly, then mass and total energy are already conserved to fewer than 5 digits after only a few timesteps. Is the output incorrect, is the code not conserving mass and energy to the level I'd expect for double-precision calculations, or am I missing something else?
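
For reference, my working assumption is that the "conserved to N digits" number measures the relative change in the global field sum since the start of the run, i.e. something like the sketch below (the exact formula MethodFluxCorrect uses may differ; the helper name is mine):

import math

def conserved_digits(current_sum, initial_sum):
    # Assumed definition: digits of agreement between the current and the
    # initial global field sum. A zero change gives "inf digits", matching
    # the internal_energy line above. Assumes initial_sum is nonzero.
    diff = abs(current_sum - initial_sum)
    if diff == 0.0:
        return math.inf
    return -math.log10(diff / abs(initial_sum))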

jobordner commented 3 years ago

Yes, I've encountered this as well, even with non-AMR, PPM-only problems with periodic boundary conditions, which should rule out almost everything but the hydro solver. Comparing conservation between Enzo-E and ENZO on the same problem might be helpful.

gregbryan commented 3 years ago

It might also be worth trying with VLCT.

mabruzzo commented 3 years ago

Yeah, I'll definitely try it with VLCT once I get solver-dd merged into my branch.

Hopefully, I'll have enough time to finish the merge in the next few days (the code now builds - I just need to pull in the 2 most recent commits from solver-dd and make sure that all of the tests pass)

mabruzzo commented 3 years ago

As a means of procrastination (dealing with the failing test in my PR), I've started to look into this. I've got a mostly working implementation of flux corrections for the hydro version of the vlct integrator.

But I have a few clarifying questions/comments about the existing ppm integrator. I'm not particularly familiar with flux corrections, so definitely point out anything you think I'm getting wrong (this is also why I haven't gone ahead and started fixing things myself).

  1. I'm not sure we should actually expect the default Hi.in example to conserve these quantities, because it has "reflecting" boundaries. Moreover, it doesn't have flux corrections enabled by default for the velocities (which I think is required to conserve energy). However, after modifying the boundaries and the set of quantities we flux correct, we still don't get great conservation.
  2. From what I can tell, there isn't any machinery in place to correctly perform flux corrections on specific quantities (e.g. total energy or velocity components). Am I missing something? (I wrote some code to change this, but changing this alone does not resolve the problem)
  3. I think there could be bugs in FaceFluxes::coarsen (although it's more likely that I'm misunderstanding something or missing some additional operation). First, I want to clarify that the "fluxes" being corrected are actually dt*flux/cell_width, right? With that in mind, I think there may be an issue with the constant factor that multiplies the sum of (2 or 4) fine "fluxes" to get the coarsened "flux".
    • When rank == 2, the sum is currently multiplied by a factor of 0.5*dArea = 0.125 (because dArea=0.25). Why isn't the sum multiplied by 0.25? (I would expect a factor of 0.5 to adjust dt/cell_width, times a factor of 0.5 to account for the smaller "area" of a refined 2D cell's interface.)
    • When rank == 3, the sum is currently multiplied by a factor of 0.25*dArea = 0.03125 (because dArea=0.125). Why isn't the sum multiplied by 0.125? (I would expect a factor of 0.5 to adjust dt/cell_width, times a factor of 0.25 to account for the smaller "area" of a refined 3D cell's interface.)
  4. Finally, I've noticed some worrying results when I use the ppm solver to run a series of identical sound waves along the x-axis that are parallel to each other (I started running these to debug my hydro flux correction implementation for the vlct integrator - I ultimately plan to perform convergence tests with inclined waves like they did in Athena++). I compared the results of a unigrid version with an amr version. I find two (possibly related) weird results (these results didn't really change when I turned off all optimizations and just used linux_gnu.py with flags_arch = '-O0 -g -fPIC -pedantic'):
    1. I would expect that at a given x, all cells should have identical values independent of their (y,z) positions. This is exactly what I find in the unigrid case. However, that's not what I find in the amr case: instead, I find standard deviations of 1e-13 through 1e-11. This analysis script reproduces the result (a minimal version of the check is sketched just after this list). While this may sound insignificant, I'm concerned that it's a symptom of an indexing error.
    2. Additionally, the conservation summary printed after each timestep indicates that the amr case does not perfectly conserve "velocity_y" and "velocity_z" (for reference, the unigrid case does perfectly conserve these quantities). Even if the flux correction is not exactly correct for specific quantities, I would expect these to be conserved, since I think the relevant fluxes should be 0. Admittedly, the discrepancies are very small (of order 1e-21). I'm also unfamiliar with ppm's twoshock Riemann solver and how exactly steepening/flattening/diffusion work (I guess if they introduce non-zero fluxes that perfectly cancel on a unigrid, that could explain this result for AMR).
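
For completeness, here is a minimal sketch of the uniformity check from item 4.i (the file path, the field tuple, and the use of a yt covering grid are placeholders/assumptions; my actual analysis script does the equivalent bookkeeping by hand):

import numpy as np
import yt

# For a wave that only varies along x, every (y, z) column at fixed x should
# be bit-identical; any nonzero spread hints at an indexing or refresh issue.
ds = yt.load("ENZO_E_OUTPUT/parameters.out")        # placeholder path
level = 1                                           # finest level in the amr run
dims = ds.domain_dimensions * ds.refine_by**level
cg = ds.covering_grid(level, left_edge=ds.domain_left_edge, dims=dims)

rho = np.asarray(cg["gas", "density"])              # shape (nx, ny, nz)
std_per_x = rho.std(axis=(1, 2))                    # spread across y and z at each x
print("max std-dev across (y, z):", std_per_x.max())
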
mabruzzo commented 3 years ago

@jobordner I realized that I didn't tag you in my previous comment.

If you've already seen it, I don't mean to pressure you to respond quickly. I'm just tagging you to make sure it catches your attention (so that you can provide some insight into these points when you have a chance)

mabruzzo commented 3 years ago

@jobordner @pgrete @forrestglines

Here's the link to the branch with the commit that should apply flux corrections to quantities like velocity and specific total energy correctly: https://github.com/mabruzzo/enzo-e/tree/conserved-flux-correct

Apologies if the updated code is a little messy. Additional optimizations could definitely be made, once we know that this works correctly

EDIT: I realized there was an indexing issue; it should be correct now. Maybe I'll try refactoring it to use CelloArray instances (that might make things less complicated).

jobordner commented 3 years ago

As far as I can tell, the constant factors in FaceFluxes::coarsen() from point 3 above are correct. Density is conserved in my test problems, but if I change the factors to what one would expect (e.g. 0.25 for 2D), it isn't. I haven't worked it out, but I think the issue may be related to how fluxes are computed in flux_twoshock.F below, the code for which was copied directly from ENZO to Enzo-E.

c
c  Copy into flux slices (to return to caller)
c

!      do i=i1, i2+1
!         df(i,j) = dt*dub(i)
!         ef(i,j) = dt*(dueb(i) + upb(i))
!         uf(i,j) = dt*(duub(i) + pb(i))
!         vf(i,j) = dt*duvb(i)
!         wf(i,j) = dt*duwb(i)
!      enddo

       do i=i1, i2+1
          qc = dt/dx(i)
          df(i,j) = qc*dub(i)
          ef(i,j) = qc*(dueb(i) + upb(i))
          uf(i,j) = qc*(duub(i) + pb(i))
          vf(i,j) = qc*duvb(i)
          wf(i,j) = qc*duwb(i)
       enddo
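
For reference, the commented-out lines store dt*flux, while the active lines store (dt/dx)*flux, which is exactly the per-cell increment in the standard 1D conservative update

    U_i^{n+1} = U_i^n - \frac{\Delta t}{\Delta x}\left(F_{i+1/2} - F_{i-1/2}\right),

so any coarsening factor has to be consistent with that extra 1/dx (which differs between levels) on top of the face-area averaging.
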
jobordner commented 3 years ago

@mabruzzo FYI I got compile errors in your conserved-flux-correct branch after your latest commit due to CelloArray not being defined, but adding an include for enzo_CelloArray.hpp in _problem.hpp fixed it.

mabruzzo commented 3 years ago

Yeah, I had done that locally and just forgot to commit it. I just added it to the branch.

I don't think that the use of CelloArrays is really adding anything right now. If it's your preference, we could drop that change later on

mabruzzo commented 3 years ago

@jobordner @pgrete @forrestglines FYI, I just fixed another bug in my branch (that I had introduced myself)

EDIT: I just pushed the commit

mabruzzo commented 3 years ago

There seem to be some inconsistencies between what yt shows and what I see in the simulation (According to the terminal output, the y-momentum becomes non-zero at cycle 1, but that doesn't seem to be the case when looking at it in yt).

I still need to try resolving it, but I just wanted to share this plot of the nonzero y-velocities at cycle number 2 for this parameter file. In this plot, each grid is showing an x-y slice at a different z-value.

[image: x-y slices of the nonzero y-velocities at cycle 2]

forrestglines commented 3 years ago

@mabruzzo This is the raw data spanning the entire xy-plane through the domain, right?

This is also a change in refinement across a periodic boundary. Does the same behavior happen if the fine and coarse grids are reversed in x? Can you get the same behavior in the middle of the grid, perhaps by adding more refinement zones?

Does changing to reflecting/outflow conditions remove the nonzero y-velocities?

@jobordner As currently implemented, do the flux corrections account for refinement changes across a periodic boundary?

mabruzzo commented 3 years ago

After taking another look, I think the concerns I mentioned about exactly when the issue shows up in the simulation vs when it shows up in the yt-visualized data might be a non-issue (I always get a little confused about the order of operations for when exactly outputs are written out).

There doesn't seem to be an issue in the y-velocity after the very first update, but maybe there is an issue in the other quantities? (It might be worth looking at the L0 error norm for that reason).

I'm also thinking about switching to a contact wave since the flux calculations should be even simpler in that case.

This is the raw data spanning the entire xy-plane through the domain, right?

This is also a change in refinement across a periodic boundary.

@forrestglines That is correct. All of your ideas make a lot of sense! I'll try getting answers to you about those questions in the next few minutes.

@jobordner When initializing an AMR simulation, at what refinement level are the initializers applied? Are they applied only at the root level, or do they get called again after the initial refinement?

mabruzzo commented 3 years ago

I just tried out this parameter file (this is data at the same cycle), and it looks like it's not (just) an issue with periodic boundary conditions. I think this may point to an indexing problem.

I realize that's not exactly what @forrestglines is looking for. I'll try to be more systematic going forward.

[image: x-y slices of the nonzero y-velocities for this run]

mabruzzo commented 3 years ago

@jobordner @forrestglines @pgrete

I'm confident that I tracked down the location of the problem: it seems to be in the prolongation-refresh step.

I determined that while running a contact wave (i.e. it has constant pressure). The input file is provided here. You can reproduce this by checking out my branch and overwriting:

It turns out that at cycle 2, the y-velocity component is given a non-zero value (It doesn't matter what optimization level your compiler uses). Then the flux correction propagates this back to the coarser grid.

Here's an image of the y-velocity to show what's going on: ![image](https://user-images.githubusercontent.com/28722054/119206671-cedf3880-ba69-11eb-914c-630538892238.png)
mabruzzo commented 3 years ago

On slightly further inspection, the prolongation-refresh step might not actually be the root cause of the problem (though it still seems to have issues). It turns out that on the very first refresh, not all values of a given field at a given x-index in the ghost zones share the same value (when they absolutely should - I think this probably needs to be addressed regardless). But the magnitude of the discrepancy is so small that it doesn't make much difference. Something else seems to be happening during EnzoMethodPpm::compute in the following cycle (cycle 1).

jobordner commented 3 years ago

@mabruzzo it looks like your block size is 4^3? Not sure it's the issue, but the block size should be at least twice the ghost zone depth for AMR; ghost zones overlapping non-adjacent blocks are not supported.

mabruzzo commented 3 years ago

Yeah, you're absolutely right. After making that change, it seems like the imperfect conservation of the transverse momentum for the contact wave is a property of the ppm solver. I'll look at this more on Monday

jobordner commented 3 years ago

@mabruzzo FYI your fix to scale fields in the make_field_conservative group before applying the correction makes a noticeable improvement in my flux correction tests for both momentum and total_energy. @stefanarridge you may want to incorporate these changes (345435fd) to see if that addresses your issue #87.

mabruzzo commented 3 years ago

@jobordner Yeah, I was noticing that too. I'm going to make a PR merging in that particular change to the main branch tonight/tomorrow.

I'm also going to make another post here summarizing the remaining issues (I can reproduce them without including flux corrections). It's possible that it's just the integrators acting funky, but I find that somewhat unlikely since I can reproduce it with both ppm and VL+CT.

mabruzzo commented 3 years ago

@jobordner @pgrete @forrestglines

So I did a lot of digging at the end of last week and yesterday (and made a minor breakthrough this morning), and I think I've identified all of the remaining problems.

As I mentioned near the end of last week, I decided to use an Entropy/contact wave instead of a sound wave (for simplicity). Just so we're on the same page, the contact wave is initialized with constant pressure and velocity plus a density perturbation that is simply advected.

As was mentioned in the preceding posts, my patch that converts all quantities to conserved form while applying the flux corrections significantly improves conservation (I plan to clean it up a little and submit a PR later today). But there are still some outstanding issues.
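
Schematically, the patch just does the bookkeeping sketched below (hypothetical helper names, not the actual Cello/Enzo-E calls; the real code loops over the fields in the make_field_conservative group block-by-block):

def to_conserved(velocity_x, total_energy, density):
    # Scale specific quantities by density so the correction acts on
    # conserved densities (x-momentum density, total energy density).
    return velocity_x * density, total_energy * density

def back_to_specific(momentum_x, energy_density, density):
    # Convert back after the flux correction; density here is the
    # (already corrected) density field.
    return momentum_x / density, energy_density / density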

Here is a table listing the number of conserved digits for 3 different runs (as printed by MethodFluxCorrect). Note that the number of conservation digits for total_energy, velocity_x, velocity_y, and velocity_z was actually computed for the density multiplied by each of these fields (i.e., energy density, x-momentum density, y-momentum density, z-momentum density).

               Unigrid N32   Unigrid N64   AMR
density        16.181        15.7808       8.57013
total_energy   15.8899       15.1277       9.01729
velocity_x     15.1037       14.9452       8.57013
velocity_y     15.257
velocity_z     15.6889

In each simulation, the domain extends from (0.0, 0.0, 0.0) to (1.0, 0.25, 0.25). In the AMR and Unigrid N32 runs, a block consists of 8^3 cells and the root grid consists of (32, 8, 8) cells. In the AMR run there is only 1 level of refinement, applied to the densest gas (due to a minor oversight, the refined blocks are never coarsened). In Unigrid N64, a block consists of 16^3 cells and the root grid consists of (64, 16, 16) cells.

The table highlights 2 outstanding problems (that pertain to ProlongLinear and are independent of the flux corrections):

  1. The AMR run has much worse conservation of density, total_energy, and velocity_x than either unigrid run. While running the simulation, each of these fields is conserved to >15.5 digits for the first 10 cycles, and then during cycle 11 they are only conserved to ~9.2 digits (I track this with a quick scan of the stdout, sketched just after this list). I have confirmed that this deterioration in conservation coincides with the first time a root grid is refined (post initialization). If somebody knows how to do so (I don't), it would be helpful to see how well Enzo or Athena++ conserves these quantities following a refinement, just so that we know what we should be aiming for.

  2. Why are there any x positions where a field can have different values at different y- and z-values? The waves are uniformly initialized and should evolve only in the x-direction, with no evolution in the y- or z-directions (this problem manifests as the imperfect conservation of velocity_y and velocity_z). This problem occurs on the fine grid (without flux corrections) for both the PPM and VL+CT integrators. It turns out that there are small 1e-16 deviations in the ghost zones of the density and total_energy fields that I think are caused by ProlongLinear (these deviations probably aren't present in the other fields because they are uniform). I identified this issue by inspecting the ghost zones in the outputs of a run with no methods other than the Null method (to force the refresh). I assume this has something to do with ProlongLinear's use of extrapolation. I would assume we can fix this after @jobordner merges in his branch (which will allow us to prolong using values from the neighboring blocks). This problem isn't a significant issue, but it's probably worth fixing to verify that it isn't obfuscating any other problems.
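
For reference, here is the kind of stdout scan I mentioned in point 1 (the log file name is a placeholder; the regex just matches the MethodFluxCorrect lines quoted at the top of this issue):

import re

pattern = re.compile(r"Method Field (\S+) sum \S+ conserved to (\S+) digits")
history = {}  # field name -> conserved digits, in output order

with open("out.txt") as f:                 # placeholder: the simulation's stdout
    for line in f:
        match = pattern.search(line)
        if match:
            field, digits = match.group(1), float(match.group(2))
            history.setdefault(field, []).append(digits)

# Report the first output where a field loses more than one digit at once.
for field, digits in history.items():
    drops = [i for i in range(1, len(digits)) if digits[i] < digits[i - 1] - 1.0]
    print(field, "first large drop at output #", drops[0] if drops else "none")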

Here is a tar file with some python scripts and the input files I used for debugging these problems: testing.tar.gz (apologies that they are a little messy).

I'm not sure when I'll have time to come back and address the first issue (either fix it or confirm that other codes don't do any better). So somebody else (not necessarily somebody in this thread) may want to take a stab at it.

Once we address the first outstanding problem, it would probably be worth running a convergence study with the inclined hydrodynamic waves (and AMR) to verify that we get the expected convergence. I'm very happy to help out with this. I've done this before (without AMR) for the VL+CT integrator, so I have some scripts lying around to make this easier.

mabruzzo commented 3 years ago

@forrestglines I'm pretty confident that I have working flux corrections for the VL+CT integrator. I decided to take one last look at it tonight, and it turns out that I found the last bug a few days ago (I was just getting weird results because I forgot to set the Courant factor to something sensible). You can find the updated source terms on this branch here (this is different from the branch I shared last week).

Hopefully this saves you some time (unless you already solved this independently).

To be clear, I haven't tested this too rigorously. However, using this parameter file, I have managed to replicate the conservation of each quantity to roughly the same level of precision as the PPM integrator for the x-axis aligned entropy wave.

As an aside, if you use the commented out Adapt section, (which is equivalent to static mesh refinement), you get substantially improved conservation. The density, total energy density, and x-momentum density are all conserved to upwards of 15.7 digits (This reaffirms my belief that the remaining problems are associated with ProlongLinear)

forrestglines commented 3 years ago

@forrestglines I'm pretty confident that I have working flux corrections for the VL+CT integrator. I decided to take one last look at it tonight, and it turns out that I found the last bug a few days ago (I was just getting weird results because I forgot to set the Courant factor to something sensible). You can find the updated source terms on this branch here (this is different from the branch I shared last week).

Very good work @mabruzzo! I haven't looked at this since last Friday, but I was following your comments. I'll try to take a look at your fixes on Monday, although I'll probably be very busy for the next two weeks with proposal writing.

To be clear, I haven't tested this too rigorously. However, using this parameter file, I have managed to replicate the conservation of each quantity to roughly the same level of precision as the PPM integrator for the x-axis aligned entropy wave.

I can double check your testing when we put together a PR.

As an aside, if you use the commented out Adapt section, (which is equivalent to static mesh refinement), you get substantially improved conservation. The density, total energy density, and x-momentum density are all conserved to upwards of 15.7 digits (This reaffirms my belief that the remaining problems are associated with ProlongLinear)

So when we remove the Adapt section, we remove adaptivity, but we still have multiple levels of refinement i.e. SMR. In that case we get machine precision conservation. I agree then that the remaining conservation issues are in ProlongLinear. Might we be making the same mistake where we're prolonging velocities when we should be prolonging momenta?

Also, how should we incorporate this branch? Should we merge these changes with PR #94 or make a new PR? I can tackle a new PR but it would have to wait two weeks.

mabruzzo edit: I mistakenly edited your comment while responding. I returned it back to what it originally said

mabruzzo commented 3 years ago

So when we remove the Adapt section, we remove adaptivity, but we still have multiple levels of refinement i.e. SMR.

Yes, I think your understanding is correct.

Technically, we're switching our adapt criterion from a density threshold to a "mask". For simplicity, it refines all root blocks where x < 0.5 (the x boundaries of our domain are 0.0 and 1.0). The initial refinement is performed on startup (so the refined blocks are directly initialized by the problem initializer), and the refinement never changes (hence SMR).

In that case we get machine precision conservation. I agree then that the remaining conservation issues are in ProlongLinear. Might we be making the same mistake where we're prolonging velocities when we should be prolonging momenta?

I don't think so. At a glance, the control flow is a little hard to follow, but I think that FieldFace calls the Prolong class during refinement and refresh. And I think that FieldFace converts quantities to conserved form before prolongation and converts back to the initial form after prolongation. @jobordner, please correct me if I'm wrong.

In the SMR run, some prolongation is currently happening during the refresh. But I don't think that needs to be conservative (the flux correction corrects for that). A convergence study might be telling. But, currently ProlongLinear is not monotonic (I think changes introduced by PR #97 will allow it to be) so that might obfuscate the results of such a study.

I haven't dug very deep, and I think PR #97 might alter how Prolong is called (but I could be wrong).

Also, how should we incorporate this branch? Should we merge these changes with PR #94 or make a new PR? I can tackle a new PR but it would have to wait two weeks.

I'd prefer that we make a separate PR (so that we can include changes from both PR #94 and #108, the flux-correction bugfix). In the next few days, I might add another commit to try to cleanup the implementation a little more (in particular, the code used to register the buffers for storing the fluxes is a little messy). But, I might let you put the actual PR and tests together. Does that sound like a reasonable plan?

EDIT: I looked at this briefly over the weekend and added a test (based on the test that I introduced in PR #108) that appears to demonstrate that flux correction works in 3D. I still need to add a version of the test with different expected results for single-precision mode. I'll try to finish cleaning up the implementation this afternoon or tomorrow.