AMReX-Astro / Castro

Castro (Compressible Astrophysics): An adaptive mesh, astrophysical compressible (radiation-, magneto-) hydrodynamics simulation code for massively parallel CPU and GPU architectures.
http://amrex-astro.github.io/Castro

Can we write the advance algorithm the way non-subcycling AMR algorithms look? #2183

Open maxpkatz opened 2 years ago

maxpkatz commented 2 years ago

A non-subcycling AMR version of our code (in the Strang build, not the simplified SDC build) would do the following (assumes two levels for simplicity, and ignores burning for now):

(1) Compute old-time coarse-level source predictor (first-order accurate) and apply to state
(2) Compute second-order accurate hydro source on coarse level

(3) Do old-time Strang burn on fine level (directly updates the state)
(4) Compute old-time fine-level source predictor and apply to state
(5) Compute hydro source on fine level and apply to state
(6) Correct coarse-level hydro source with fine-level hydro source
(7) Compute new-time fine-level source corrector and apply to state (makes the advance fully second-order accurate)

(8) Apply hydro source on coarse level
(9) Compute new-time coarse-level source corrector and apply to state

For example, this is what the FLASH paper suggests. But Castro does (8) and (9) before the fine level, and then corrects the hydro and non-hydro source terms on the coarse level during the reflux operation. The primary downside of our approach is that it involves a re-calculation of the source terms on the coarse level. If the source terms are simple, this is cheap, but when we are using gravity, the correction solve can be expensive. Section 6.2.3 of the Castro paper comments that this is not so bad because the correction solve should be easier than the initial level solve. However, the gravity solve is so expensive on GPUs relative to everything but burning that having three solves per level rather than two really does pose a serious cost.
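To make the cost difference concrete, here is a minimal bookkeeping sketch in Python. It is not Castro code: the step labels paraphrase the numbered list above, the Castro ordering is my reading of the description in this comment, and all function names are hypothetical. It only counts how often coarse-level source terms (for example, gravity) get evaluated under each ordering.

```python
# Toy bookkeeping only. Labels paraphrase the steps in this issue;
# nothing here calls or mirrors a real Castro/AMReX API.

def non_subcycling_order():
    """Ordering (1)-(9) from the list above, for two levels."""
    return [
        ("coarse", "source predictor"),                # (1)
        ("coarse", "compute hydro source"),            # (2)
        ("fine", "Strang burn"),                       # (3)
        ("fine", "source predictor"),                  # (4)
        ("fine", "hydro update"),                      # (5)
        ("coarse", "correct hydro source from fine"),  # (6)
        ("fine", "source corrector"),                  # (7)
        ("coarse", "apply hydro source"),              # (8)
        ("coarse", "source corrector"),                # (9)
    ]

def castro_order():
    """Assumed ordering: (8) and (9) before the fine level, then reflux."""
    return [
        ("coarse", "source predictor"),
        ("coarse", "compute hydro source"),
        ("coarse", "apply hydro source"),
        ("coarse", "source corrector"),
        ("fine", "Strang burn"),
        ("fine", "source predictor"),
        ("fine", "hydro update"),
        ("fine", "source corrector"),
        ("coarse", "reflux hydro correction"),
        ("coarse", "source re-evaluation after reflux"),
    ]

def coarse_source_evaluations(order):
    """Count coarse-level steps whose label starts with 'source'."""
    return sum(1 for level, step in order
               if level == "coarse" and step.startswith("source"))

print("non-subcycling:", coarse_source_evaluations(non_subcycling_order()))
print("castro-style:  ", coarse_source_evaluations(castro_order()))
```

With gravity enabled, each counted step corresponds to a solve on the coarse level, which is where the "three rather than two" figure comes from.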

Berger and Colella (1989) use subcycling but agree with this modification, emphasizing that the flux correction should be done as a separate step after both the fine grids and coarse grids are done with the main advance. But they were describing a pure hydrodynamics algorithm without sources, so from a cost perspective it was probably fairly neutral (the flux correction operation for hydro alone is cheap). The question is: can we have steps 3-7 be done for multiple subcycles on the fine level before steps 8-9? After all, Berger and Colella's argument is only that doing it in this modified order makes the code simpler, which is an aesthetic consideration, not a physics/math consideration.

The argument for why we need to do it this way when subcycling is that we may need to collect information from the coarse grid during the fine grid advance. For example, a FillPatch operation on the fine grid will borrow data interpolated in time from the coarse grid on the boundaries of the fine grid, or a regrid during the fine timesteps may interpolate from the coarse grid to fill the data. So we need to have a valid estimation of U^{n+1} on the coarse grid while doing the fine steps.
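As an illustration of why a coarse-level U^{n+1} is needed, the sketch below does the kind of linear time interpolation that a fine-level boundary fill relies on. It is a toy, not the AMReX FillPatch implementation: the function names `interp_coarse_in_time` and `fine_substep_times` and the sample values are made up, and spatial interpolation is omitted entirely.

```python
# Toy time interpolation of coarse data; names and values are illustrative.

def interp_coarse_in_time(u_old, u_new, t_old, t_new, t):
    """Linearly interpolate coarse data between t_old and t_new."""
    if not (t_old <= t <= t_new):
        raise ValueError("t must lie within the coarse step")
    alpha = (t - t_old) / (t_new - t_old)
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(u_old, u_new)]

def fine_substep_times(t_old, dt_coarse, n_subcycles):
    """Start time of each fine substep within one coarse step."""
    dt_fine = dt_coarse / n_subcycles
    return [t_old + k * dt_fine for k in range(n_subcycles)]

u_old = [1.0, 2.0]   # coarse state at t^n (made-up values)
u_new = [3.0, 6.0]   # coarse estimate at t^{n+1} (made-up values)

for t in fine_substep_times(0.0, 1.0, 2):
    print(t, interp_coarse_in_time(u_old, u_new, 0.0, 1.0, t))
```

Any fine substep that starts after t^n needs `u_new`, so the quality of the interpolated boundary data depends directly on what has been applied to the coarse state by that point.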

Given that, is there anything we can do to avoid calculating the source terms three times per step? Idea/proposal to follow in a subsequent comment.

maxpkatz commented 1 year ago

One possibility would be to decide that it's OK to not have applied the new-time sources yet, for the purposes of interpolating coarse grid data onto the fine grid. The data would be first-order in time for the sources, and second-order in time for the hydro. The new-time sources would all be applied in post_timestep(), so the overall algorithm might still be second order accurate, but we would have to test this to see if there's meaningful degradation in quality compared to the current approach.
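The predictor/corrector distinction can be explored on a toy problem before touching the full code. The sketch below integrates dy/dt = -y, treating the right-hand side as the "source", once with the old-time predictor only and once with the predictor plus a new-time corrector. The ODE, the function names, and the step counts are illustrative assumptions; this says nothing about how interpolated boundary data would behave in a real AMR run, which still needs the test described above.

```python
import math

def source(y):
    """Toy source term: dy/dt = -y."""
    return -y

def advance(y0, t_end, n_steps, use_corrector):
    """Integrate with an old-time predictor and optional new-time corrector."""
    dt = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        s_old = source(y)
        y_new = y + dt * s_old                   # predictor: first order
        if use_corrector:
            s_new = source(y_new)
            y_new += 0.5 * dt * (s_new - s_old)  # corrector: second order
        y = y_new
    return y

def observed_order(use_corrector, n=64):
    """Estimate the convergence order from errors at n and 2n steps."""
    exact = math.exp(-1.0)
    err_n = abs(advance(1.0, 1.0, n, use_corrector) - exact)
    err_2n = abs(advance(1.0, 1.0, 2 * n, use_corrector) - exact)
    return math.log2(err_n / err_2n)

print("predictor only:       ", observed_order(False))
print("predictor + corrector:", observed_order(True))
```

The measured orders should come out near 1 and near 2 respectively. The analogous experiment for the proposal is a convergence test where the fine level sees predictor-only coarse data at its boundaries while the corrector is deferred to post_timestep().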

zingale commented 9 months ago

has this been done?

maxpkatz commented 9 months ago

It has been done for problems which are not subcycling on any level. Work still needs to be done on cases where some levels subcycle but others might not, and I'm still not sure what improvements are possible for the subcycling case.