enzo-project / enzo-e

A version of Enzo designed for exascale and built on charm++.

Conservation errors in PPM vs VLCT (pure hydro mode) #162

Open stefanarridge opened 2 years ago

stefanarridge commented 2 years ago

I have found that when running a Sedov blast problem with the PPM method, with periodic boundary conditions, mass and momentum are not perfectly conserved, particularly when the center of the blast is at the intersection between blocks (as opposed to in the middle of a block). All runs were done with double precision. See the figures below:

mmc_block_center_n_64_ppm mmc_block_intersection_n_64_ppm

However, this problem is not apparent when running with the VLCT solver:

mmc_block_center_n_64_vlct mmc_block_intersection_n_64_vlct

I am attaching a tarball containing parameter files for all the cases I ran, as well as python scripts used to produce the plots: sedov_conservation_tests.tar.gz

My suspicion is that the fluxes at the block boundaries are not perfectly symmetric - it's the only reason I can think of for why mass and momentum would not be conserved, and why the behaviour is different depending on where the center of the blast is, relative to the edges of the block. Note how, in the case where the blast center is in the middle of the block, the errors are very small up to a certain point, when they suddenly start to increase. I haven't tested this rigorously, but I suspect that this corresponds to the time where the blast wave reaches the edge of the block.
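To make the flux-symmetry argument concrete, here is a minimal hedged sketch (a generic 1D finite-volume update, not Enzo-E code) showing that conservation holds exactly when neighbouring cells share interface fluxes, and breaks as soon as the two cells adjacent to one interface see slightly different flux values:

```python
import numpy as np

# Generic conservative 1D update on a periodic domain:
#   u_i^{n+1} = u_i^n - (dt/dx) * (F_{i+1/2} - F_{i-1/2})
# F has one entry per interface; each interior interface is shared by
# its two neighbouring cells, so the total of u telescopes away.
rng = np.random.default_rng(0)
u = rng.random(64)
F = rng.random(65)
F[-1] = F[0]                        # periodic wrap: one shared interface
dt_dx = 0.1

u_new = u - dt_dx * (F[1:] - F[:-1])
print(abs(u_new.sum() - u.sum()))   # roundoff-level: conserved

# Break the symmetry at one "block boundary": the two cells adjacent to
# interface 32 now see slightly different fluxes (e.g. from ghost-zone
# values that don't exactly match the neighbouring block's).
F_left = F.copy()                   # flux each cell uses on its left face
F_right = F.copy()                  # flux each cell uses on its right face
F_left[32] += 1e-10                 # hypothetical mismatch
u_bad = u - dt_dx * (F_right[1:] - F_left[:-1])
print(abs(u_bad.sum() - u.sum()))   # ~1e-11: conservation broken
```

The mismatch need not be large: any asymmetry at a block face injects (dt/dx) times the flux difference into the total every step, which would accumulate once the blast wave reaches the block edge.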

I also suspect that this explains what is going on in Issue #87 , although things are complicated there by the addition of a gravity solver.

Edit: I see now that this was already discussed in #56 , and potentially addressed in PR #108. I will try once again to turn on the flux_correct method in the PPM runs, but I am a bit unclear as to whether flux_correct does anything useful in a non-AMR run.

@jobordner, do you know what might be causing this? Does the flux_correct method do anything to deal with global conservation errors? I tried running with flux_correct, but it didn't appear to change anything.

@mabruzzo , what does the VLCT method do differently that ensures mass / momentum conservation? Also, out of curiosity, does the courant parameter have a different meaning for the VLCT method than for the PPM method? I set it equal to 0.3 in all cases, but I found that the VLCT runs reached the same physical time in significantly fewer cycles.

stefanarridge commented 2 years ago

Update: I re-ran the tests with the PPM method, but this time with the flux_correct method enabled as well. The results appear to be identical to the tests without flux_correct:

mmc_block_intersection_n_64_ppm_flux_correct mmc_block_center_n_64_ppm_flux_correct

Here is a tarball containing parameter files and plots for these runs (as well as higher resolution runs). sedov_conservation_tests_with_flux_correct.tar.gz

mabruzzo commented 2 years ago

This is fascinating! It would be cool if you could also add roundoff errors for "total_energy" to these plots (this should only be conserved when you don't use the dual energy formalism).

@mabruzzo , what does the VLCT method do differently that ensures mass / momentum conservation?

I don't have a simple answer for you, but I have a theory.

These plots actually make me feel somewhat vindicated about my efforts.

As an aside: one idea I've mulled over for a long time (we'll see if I ever get to it; there isn't any scientific payoff to it) is to gradually port the reconstruction and Riemann solver routines from the PPM solver to C++ and allow them to be used interchangeably between the VL+CT solver and the PPM solver. Doing this could make it a lot easier to track down the issues.

It's also my intention to eventually modify some of the hydro-testing scripts so that we can easily apply the same test problems to the Ppm solver and the VL+CT solver... (but I'm holding off until after the CMake PR is complete).

Also, out of curiosity, does the courant parameter have a different meaning for the VLCT method than for the PPM method? I set it equal to 0.3 in all cases, but I found that the VLCT runs reached the same physical time in significantly fewer cycles.

So I looked into it, and the way we determine the maximum timestep is different in the two solvers. In each case, the solver computes dt for every cell in the active region of the system and then uses the minimum value of dt for the timestep:

In each case, c0 is the courant factor. Note that PPM supports larger courant values (a comment in calc_dt.F says it supports values up to 0.8), whereas the VL+CT solver requires a courant factor <= 0.5.
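To illustrate why the same courant factor can produce different cycle counts, here is a hedged sketch of two common CFL definitions. Neither formula is claimed to be exactly what PPM or VL+CT implements (see calc_dt.F and the VL+CT timestep code for the real ones); the point is only that different combinations of the per-dimension signal speeds give different dt for the same c0:

```python
import numpy as np

# Hypothetical per-cell sound speeds and velocities for the illustration.
rng = np.random.default_rng(1)
cs = 1.0 + rng.random(100)          # sound speed in each of 100 cells
v = rng.normal(size=(100, 3))       # 3D velocity in each cell
dx = 1.0 / 64
c0 = 0.3                            # courant factor

rate = cs[:, None] + np.abs(v)      # per-cell, per-dimension signal rate

# Choice A: dt = c0 * min over cells and dimensions of dx / rate
dt_a = c0 * np.min(dx / rate)

# Choice B: dt = c0 / max over cells of the summed per-dimension rates / dx
dt_b = c0 / np.max(np.sum(rate / dx, axis=1))

# B is always <= A (by up to a factor of 3 in 3D), so a run using B
# would need more cycles to reach the same physical time.
print(dt_b <= dt_a, dt_a / dt_b <= 3.0)
```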

stefanarridge commented 2 years ago

@mabruzzo, I followed your suggestion and remade the plots, adding in a line for the total energy. Note that I made a slight change to the python script which makes the plots: it now replaces any zeros in the "error arrays" with a small value (1.0e-100), so that taking the log of the absolute value of the error arrays doesn't produce NaNs. As a result, if the value at some index i is non-zero but the values at indices i-1 and i+1 are zero (before the replacement), the non-zero value at index i shows up as a spike in the plot, rather than not showing up at all, as was happening previously.
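The zero-replacement trick could look something like this (a hypothetical fragment in the spirit of the plotting script, not the actual conservation.py):

```python
import numpy as np

def safe_log_abs(err, floor=1e-100):
    """log10 of |err| with exact zeros floored at a tiny value, so an
    isolated non-zero sample shows up as a spike in a log plot instead
    of the point being dropped next to -inf/NaN neighbours."""
    err = np.abs(np.asarray(err, dtype=float))
    err[err == 0.0] = floor
    return np.log10(err)

errors = np.array([0.0, 0.0, 3e-16, 0.0])
print(safe_log_abs(errors))   # zeros map to -100; the spike survives
```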

Anyway, here are the plots:

conservation_plot_block_center_n_64_ppm conservation_plot_block_intersection_n_64_ppm conservation_plot_block_center_n_64_vlct conservation_plot_block_intersection_n_64_vlct

Here is my python script used to make these plots (had to add the .txt extension so that I could upload it here): conservation.py.txt

It's not clear from the plot, but in the "blast at block intersection" run with the VL+CT method, the mass error and total energy error are always exactly zero. Here is output from my conservation.py script, which prints out the maximum absolute value of the conservation errors for each quantity:

Maximum mass error = 0.0
Maximum x momentum error = 5.421010862427522e-20
Maximum y momentum error = 2.710505431213761e-20
Maximum z momentum error = 3.138406488605874e-20
Maximum total energy error = 0.0
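The kind of quantity printed above could be computed with a helper like this (a hypothetical sketch, not the actual conservation.py; it measures the absolute drift of a conserved total relative to the first output, which avoids dividing by a near-zero initial total for quantities like momentum):

```python
import numpy as np

def max_conservation_error(totals):
    """Maximum absolute drift of a conserved total over all outputs:
    max_t |Q(t) - Q(0)|.  (Mass could instead be normalized by Q(0),
    but a Sedov blast's initial total momentum is ~0, so an absolute
    measure is used here.)"""
    totals = np.asarray(totals, dtype=float)
    return np.max(np.abs(totals - totals[0]))

# e.g. total x-momentum summed over all cells at each output time
px_totals = [0.0, 3e-20, -5e-20, 1e-20]
print(max_conservation_error(px_totals))   # 5e-20
```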
pgrete commented 2 years ago

Two additional comments:

stefanarridge commented 2 years ago

the left flux used in cell i is the right flux used in cell i+1)

This is what I meant by fluxes being symmetric (although I think you meant to have "left" and "right" switched, assuming cell i+1 is to the right of cell i).

My suspicion is that this property is broken at the block boundaries somehow, which is why I get different results when I move the center of the blast. I also think this might explain the results I reported in Issue #87.

mabruzzo commented 2 years ago

@mabruzzo, I followed your suggestion and I have remade the plots, adding in a line for the total energy.

This is very cool! Thanks for doing this!

I don't want to make more work for you, but it might also be insightful to run the Ppm solver in 1D and 2D and see if the problem persists (as an aside, the VL+CT solver currently only supports 3D problems).

stefanarridge commented 2 years ago

@mabruzzo , I have done some more tests with the PPM solver in 1D and 2D. Plots below:

conservation_plot_block_center_n_64_ppm_2d conservation_plot_block_intersection_n_64_ppm_2d conservation_plot_block_center_n_64_ppm_1d conservation_plot_block_intersection_n_64_ppm_1d

Here's a zip file containing the parameter files I used for these tests, as well as the python script used to make the plots: param_files_and_python_script.zip

The main takeaways as far as I can see:

Interesting results in any case, curious to hear other people's thoughts on this.

jobordner commented 2 years ago

I think this is the same bug I ran into 10+ years ago. I think I got as far as tracing it to small errors in velocities creeping in from the ghost zones. From test/index.php in the "Method:ppm" tests:

Currently, "serial" results are incorrect for multiple blocks, which is to be expected. There are slight discrepancies in parallel runs with eight blocks because the final time after 400 cycles does not exactly match the time for the serial runs. The results look qualitatively correct, however, even at time 2.5 for 400² (over 13000 cycles).

To test it I compared values cell-by-cell, including ghost zones, for the same problem run on one block versus multiple blocks.
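That cell-by-cell comparison could be sketched as follows (a hypothetical helper, assuming the fields from the one-block and multi-block runs have already been loaded onto arrays of identical shape, with `gz` ghost zones on each side):

```python
import numpy as np

def compare_fields(a, b, gz=3):
    """Max |a - b| in the active zone and in the ghost zones, for the
    same field from two runs mapped onto arrays of identical shape."""
    active = (slice(gz, -gz),) * a.ndim
    mask = np.ones(a.shape, dtype=bool)
    mask[active] = False            # True only in the ghost-zone shell
    d = np.abs(a - b)
    return float(d[active].max()), float(d[mask].max())

one_block = np.zeros((16, 16, 16))
multi_block = one_block.copy()
multi_block[0, 0, 0] = 1e-14        # hypothetical ghost-zone discrepancy
print(compare_fields(one_block, multi_block))   # (0.0, 1e-14)
```

Reporting the two maxima separately makes it easy to see errors that originate in the ghost zones before they contaminate the active region.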

gregbryan commented 2 years ago

Thanks everyone for these results (and apologies for chiming in late). This is clearly an issue with the PPM solver, which was not carefully written to be symmetric and, because it is split, the transverse momenta are not scaled correctly. The fact that the 1D run is much better makes me suspect it is the corner cases. I doubt that changing the number of ghost zones would significantly help this issue, because the 1D sweeps only guarantee that the active cells are correct in the sweep direction, but it would be worth testing this (I think it is easy to do with the Cello framework?).