AMReX-Astro / MAESTROeX

A C++ low Mach number stellar hydrodynamics code
https://amrex-astro.github.io/MAESTROeX/
BSD 3-Clause "New" or "Revised" License

Restarting from 0th checkpoint does not work #130

Open harpolea opened 4 years ago

harpolea commented 4 years ago

If a problem is restarted from a checkpoint written at the 0th timestep, the timestep evolution itself appears to run fine. However, the time at the end of the first timestep is 1e99, which causes the program to terminate. I suspect that something is not being initialized correctly.

doreenfan commented 4 years ago

Seems to be fixed by commit d5d5ef1fa1f9562b9ec4b4f306468e07c5c8b61a

ajnonaka commented 4 years ago

When restarting from checkpoint 0, the chosen dt now looks correct, but there are other issues with restart in general. Here is a summary of what I found. These are reacting_bubble tests with 4 MPI ranks (0 OMP).

  1. inputs_3d_regression restarting from chk0000001 works fine.

  2. inputs_3d_regression restarting from chk0000000 dies during Step 3 (create MAC velocities) with an "Erroneous arithmetic operation" error.

  3. inputs_3d_amr_regression restarting from chk0000001 dies in the Step 3 MAC projection - MLMG fails to converge

  4. inputs_3d_amr_regression restarting from chk0000000 dies in Step 2 (make w0) with an "Erroneous arithmetic operation" error.

ajnonaka commented 4 years ago

Edit: regarding (2.), the w0 returned by make_w0 is complete garbage at r=1 and higher.

ajnonaka commented 4 years ago

If you add VisMF::Write(S_cc_new[0],"a_S_cc_new"); at the beginning of AdvanceTimeStep(), the output contains nonsensical values. Maybe the problem is how S_cc_new is initialized when restarting from checkpoint 0. This is with 1 MPI process.
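
For anyone who wants to check a dumped field for this kind of garbage programmatically, here is a minimal sketch in plain C++. It assumes the data has already been copied into a flat std::vector<double>. The names BadValueReport and scan_for_bad_values are hypothetical and not part of AMReX or MAESTROeX, and the 1e50 threshold is an arbitrary assumption chosen so that 1e99-style sentinel values get flagged.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not an AMReX or MAESTROeX API).
struct BadValueReport {
    std::size_t n_nonfinite = 0;  // entries that are NaN or +/-Inf
    std::size_t n_huge = 0;       // finite entries with |x| >= threshold

    bool clean() const { return n_nonfinite == 0 && n_huge == 0; }
};

// Scan a flat array of state data for values that look uninitialized.
// The default threshold of 1e50 is an assumption, picked only so that
// sentinel values such as 1e99 are counted as suspicious.
BadValueReport scan_for_bad_values(const std::vector<double>& data,
                                   double threshold = 1.0e50)
{
    assert(threshold > 0.0);

    BadValueReport report;
    for (double x : data) {
        if (!std::isfinite(x)) {
            ++report.n_nonfinite;
        } else if (std::abs(x) >= threshold) {
            ++report.n_huge;
        }
    }
    return report;
}
```

Calling scan_for_bad_values on the state at the start of a step and checking clean() would narrow down whether the bad values are already present right after the restart or appear later.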

doreenfan commented 4 years ago

Commit cfa5ed7b seems to have resolved (2.). Commit d52443ea resolves (3.) and (4.).

ajnonaka commented 4 years ago

An update on the 4 reacting bubble test problems (4 MPI, 0 OMP):

  1. inputs_3d_regression restarting from chk0000001 works fine.

  2. inputs_3d_regression restarting from chk0000000 runs to completion, but the diffs are large.

  3. inputs_3d_amr_regression restarting from chk0000001 runs to completion, with small diffs (10^-9). Jury still out.

  4. inputs_3d_amr_regression restarting from chk0000000 runs to completion, but the diffs are large.
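
To make "large" versus "small (10^-9)" concrete, here is a minimal sketch of one way to measure the diffs between a reference run and a restarted run. It is plain C++ on flat arrays and only illustrates the idea; it is not the comparison tool used for the numbers above. DiffNorms and compare_fields are hypothetical names, and normalizing by the largest reference magnitude is an assumed convention.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type (not an AMReX or MAESTROeX API).
struct DiffNorms {
    double max_abs = 0.0;  // max over i of |reference[i] - restarted[i]|
    double max_rel = 0.0;  // max_abs divided by max over i of |reference[i]|
};

// Compare one field from a straight-through run against the same field
// from a restarted run. Both arrays must use the same layout and size.
DiffNorms compare_fields(const std::vector<double>& reference,
                         const std::vector<double>& restarted)
{
    assert(reference.size() == restarted.size());

    DiffNorms norms;
    double max_ref = 0.0;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        norms.max_abs = std::max(norms.max_abs,
                                 std::abs(reference[i] - restarted[i]));
        max_ref = std::max(max_ref, std::abs(reference[i]));
    }

    // If the reference field is identically zero, fall back to the
    // absolute difference instead of dividing by zero.
    norms.max_rel = (max_ref > 0.0) ? norms.max_abs / max_ref : norms.max_abs;
    return norms;
}
```

A restart that reproduces the original run exactly gives zero for both norms, so any nonzero value is worth tracking per field and per level.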

ajnonaka commented 3 years ago

Restarting from chk0000000 works for inputs_2d_regression, so this appears to be a 3D-only issue.