LLNL / serac

Serac is a high order nonlinear thermomechanical simulation code
BSD 3-Clause "New" or "Revised" License

Dynamic history in memory #1086

Closed jamiebramwell closed 5 months ago

jamiebramwell commented 5 months ago

This PR adds an option to keep the transient history information in memory instead of writing it to disk and reading it back. The disk I/O path was causing significant slowdowns on parallel file systems.
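To make the two modes concrete, here is a minimal sketch of a history store that either keeps each saved field in memory or round-trips it through a file. It is illustrative only: `HistoryStore`, its `save`/`load` methods, and the file naming scheme are hypothetical and are not Serac's actual classes or API; only the `save_to_disk` flag name comes from this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only; not Serac's actual classes or method names.
class HistoryStore {
public:
  HistoryStore(bool save_to_disk, std::string prefix)
      : save_to_disk_(save_to_disk), prefix_(std::move(prefix)) {}

  // Record one field's values at the given cycle.
  void save(int cycle, const std::string& field, const std::vector<double>& values)
  {
    assert(cycle >= 0);
    if (!save_to_disk_) {
      memory_[Key(cycle, field)] = values;  // in-memory mode: no file I/O at all
      return;
    }
    std::ofstream out(fileName(cycle, field), std::ios::binary | std::ios::trunc);
    if (!out) throw std::runtime_error("cannot open checkpoint file for writing");
    const std::size_t n = values.size();
    out.write(reinterpret_cast<const char*>(&n), sizeof(n));
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(n * sizeof(double)));
  }

  // Retrieve a previously saved field; throws if it was never saved.
  std::vector<double> load(int cycle, const std::string& field) const
  {
    if (!save_to_disk_) {
      return memory_.at(Key(cycle, field));  // std::out_of_range if missing
    }
    std::ifstream in(fileName(cycle, field), std::ios::binary);
    if (!in) throw std::runtime_error("cannot open checkpoint file for reading");
    std::size_t n = 0;
    in.read(reinterpret_cast<char*>(&n), sizeof(n));
    std::vector<double> values(n);
    in.read(reinterpret_cast<char*>(values.data()),
            static_cast<std::streamsize>(n * sizeof(double)));
    if (!in) throw std::runtime_error("checkpoint file is truncated");
    return values;
  }

private:
  using Key = std::pair<int, std::string>;

  // One file per (field, cycle) pair; the naming scheme is an assumption.
  std::string fileName(int cycle, const std::string& field) const
  {
    return prefix_ + "_" + field + "_" + std::to_string(cycle) + ".bin";
  }

  bool                               save_to_disk_;
  std::string                        prefix_;
  std::map<Key, std::vector<double>> memory_;
};
```

In this sketch the caller sees the same `save`/`load` interface in both modes, so switching `save_to_disk` trades memory use for file-system traffic without touching the calling code.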

jamiebramwell commented 5 months ago

> Do we have a test for the case when save_to_disk is true? I may be missing something, but is the behavior different in the two cases now? If !save_to_disk, we auto checkpoint after every nonlinear solve, but I'm not seeing whether that still happens if save_to_disk==true. I may just be reading too quickly. Previously, this required a call to 'outputStateToDisk()'?

Good point. I'll automatically checkpoint to disk when this option is on.

jamiebramwell commented 5 months ago

While trying to clean up the names, I came up with this more standardized approach using BasePhysics methods. Let me know what you think!

codecov-commenter commented 5 months ago

Codecov Report

Attention: Patch coverage is 86.08696%, with 16 lines in your changes missing coverage. Please review.

Project coverage is 86.97%. Comparing base (98d9933) to head (95c51fb).

| Files | Patch % | Lines |
|---|---|---|
| src/serac/physics/solid_mechanics.hpp | 77.55% | 11 Missing :warning: |
| src/serac/physics/base_physics.cpp | 76.47% | 4 Missing :warning: |
| src/serac/physics/heat_transfer.hpp | 97.43% | 1 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           develop    #1086      +/-   ##
===========================================
+ Coverage    86.91%   86.97%   +0.05%
===========================================
  Files          159      159
  Lines        15412    15473      +61
===========================================
+ Hits         13396    13458      +62
+ Misses        2016     2015       -1
```


jamiebramwell commented 5 months ago

@kswartz92, I reverted the API change. Internally, we now cache the whole requested cycle as a group and pull from that cache if a different field from the same cycle is requested. This should reduce disk thrashing. Let me know if this works for you.
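For anyone following along, the caching idea can be sketched roughly as follows. This is a hedged illustration, not the code in this PR: `CycleCache`, `FieldMap`, and the `load_cycle` callback are hypothetical names, and the callback stands in for whatever actually reads a full cycle from disk.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only; not Serac's actual implementation.
using FieldMap = std::map<std::string, std::vector<double>>;

class CycleCache {
public:
  // load_cycle reads every field of one cycle in a single pass (assumed to be the expensive disk read).
  explicit CycleCache(std::function<FieldMap(int)> load_cycle) : load_cycle_(std::move(load_cycle))
  {
    assert(load_cycle_);
  }

  // Return one field of the requested cycle, hitting the loader only when the cycle changes.
  // The returned reference stays valid until a different cycle is requested.
  const std::vector<double>& get(int cycle, const std::string& field)
  {
    if (!has_cycle_ || cycle != cached_cycle_) {
      cached_fields_ = load_cycle_(cycle);  // one read brings in all fields for this cycle
      cached_cycle_  = cycle;
      has_cycle_     = true;
    }
    return cached_fields_.at(field);  // std::out_of_range if the field is unknown
  }

private:
  std::function<FieldMap(int)> load_cycle_;
  FieldMap                     cached_fields_;
  int                          cached_cycle_ = 0;
  bool                         has_cycle_    = false;
};
```

With a cache like this, asking for two different fields at the same cycle triggers a single load, while moving to another cycle evicts the old group and loads the new one.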

jamiebramwell commented 5 months ago

> Do we have a test for the case when save_to_disk is true? I may be missing something, but is the behavior different in the two cases now? If !save_to_disk, we auto checkpoint after every nonlinear solve, but I'm not seeing whether that still happens if save_to_disk==true. I may just be reading too quickly. Previously, this required a call to 'outputStateToDisk()'?

I made two of the thermal dynamic adjoint tests use disk-based checkpointing, and the code now automatically outputs to disk every timestep if this option is active.