Open lewisgross1296 opened 7 months ago
As discussed in person, an appropriate option for simpler continuation runs might be to add a flag to the `Integrator` classes indicating that the timesteps provided are the original timesteps for the run, and that the integrator should pick up at the next timestep based on the timesteps present in the `depletion_results.h5` file provided to the `Operator`.
Validation of the original timesteps should include:
In the absence of this flag, the timesteps provided to the `Integrator` would be treated as additional timesteps in the calculation.
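
A minimal sketch of how such a flag might behave, written as a standalone helper rather than as OpenMC code. The function name `resolve_timesteps`, the flag name `continue_run`, and the `completed_times` argument are all hypothetical; `completed_times` stands in for the cumulative times that would be read from `depletion_results.h5`. The check shown is only one plausible validation, not a list agreed on in this thread.

```python
import math
from itertools import accumulate


def resolve_timesteps(timesteps, completed_times, continue_run=False):
    """Return the timesteps the integrator should actually run.

    timesteps       -- step lengths passed to the integrator
    completed_times -- cumulative times already stored in the results,
                       beginning with the initial time 0.0 (hypothetical
                       stand-in for the contents of depletion_results.h5)
    continue_run    -- hypothetical flag: treat `timesteps` as the original
                       schedule rather than as additional steps
    """
    if not continue_run:
        # Without the flag, every supplied step is an additional step.
        return list(timesteps)

    # Cumulative times of the original schedule, starting at t = 0.
    expected = [0.0, *accumulate(timesteps)]
    if len(completed_times) > len(expected):
        raise ValueError("results hold more steps than the original schedule")

    # Every stored time must coincide with the original schedule.
    for stored, planned in zip(completed_times, expected):
        if not math.isclose(stored, planned, rel_tol=1e-9, abs_tol=1e-12):
            raise ValueError(
                f"stored time {stored} does not match planned time {planned}"
            )

    # One stored time per completed step, plus the initial time.
    n_done = max(len(completed_times) - 1, 0)
    return list(timesteps[n_done:])
```

Under these assumptions, `resolve_timesteps([1.0] * 5, [0.0, 1.0, 2.0], continue_run=True)` returns `[1.0, 1.0, 1.0]`, while the same call without the flag returns all five steps as additional ones.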
Description
Recently, I've been running some hefty depletion simulations on my HPC. They usually finish within the max runtime, but this time they took longer and got killed mid-simulation. I'm now in the business of restarting the simulation to finish the job.
One issue that I have with the way we do restarts is that `timesteps` is required to be provided in the case of a depletion restart. While the user should be able to figure out what `timesteps` will reproduce the intended depletion steps, I really would appreciate the option for the restart to just know what was initially requested and seamlessly pick up from where it left off. I don't think it would be hard to store the initial timesteps requested, though I'm not sure how much effort is required to refactor to allow this. Mainly, I think it is easy to provide different restart timesteps from what you actually wanted.
Quick anecdote as to why I think this would be beneficial:
I was playing around with this pincell depletion example and was able to cause a situation where the simulation was killed while writing the `openmc_simulation_n3.h5` file. The original simulation requests `time_steps = [1.0, 1.0, 1.0, 1.0, 1.0] # days`. In this case, when I restarted the simulation, I told it to restart with `time_steps = [1.0, 1.0] # days`. Because the simulation got killed mid-`h5` write, it re-did the transport + depletion for that step and overwrote the faulty `openmc_simulation_n3.h5` file. This then counted as one of the `timesteps` I provided.
In this case, I was off by one because I thought it would start with `openmc_simulation_n4.h5`, but it really needed to redo `n3`. OpenMC didn't run the final eigenvalue sim + write the `openmc_simulation_n5.h5`, as I intended.
On HPC, it is very possible to get killed while writing one of the `openmc_simulation_n<N>.h5` files.
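
The off-by-one in this anecdote can be illustrated with a small counting sketch. The helper `final_statepoint_index` is hypothetical and uses a deliberately simplified model taken from the anecdote itself, not from OpenMC's internal logic: each restart timestep advances the run by one statepoint past the last one that was written completely.

```python
def final_statepoint_index(last_complete_index, restart_timesteps):
    """Index N of the last openmc_simulation_n<N>.h5 a restart reaches.

    Simplified counting model (an assumption drawn from the anecdote, not
    OpenMC's internal logic): each supplied timestep advances the run by
    one statepoint past the last one that was written completely.
    """
    return last_complete_index + len(restart_timesteps)


original = [1.0, 1.0, 1.0, 1.0, 1.0]  # days, as originally requested
target = len(original)                # the run should end at n5
restart = [1.0, 1.0]                  # days, supplied on restart

# What the user assumed: n3 was fine, so two more steps reach n5.
assumed = final_statepoint_index(3, restart)   # 5

# What happened: n3 was corrupt, so n2 was the last complete statepoint.
actual = final_statepoint_index(2, restart)    # 4

shortfall = target - actual                    # 1 timestep missing
```

With the corrupt `n3` file counted correctly, three restart timesteps would have been needed to reach `n5`, which is the kind of bookkeeping the requested feature would take off the user's hands.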
In a more costly simulation, I would just launch the job and come back expecting it to finish. I'd be sad if I came back later to HPC and saw that I actually needed to restart one more time, and I might doubt that I did everything correctly and use up some time to verify what happened.
Alternatives
Status-quo: requiring the user to figure out what `timesteps` to provide with a restart.
Compatibility
The API would change by not requiring `timesteps` as an argument to `openmc.deplete.abc.Integrator`. In the case that it is not provided, the simulation should figure out what was initially requested and finish that set of `timesteps`. I don't think it would take up much memory to store this info in `depletion_results.h5`.
I think it would be easy to write a test for this API change as well.
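
As a rough sketch of the optional-argument behavior and of the kind of test it would allow, the standalone helper below picks the timesteps for a restart. Everything here is hypothetical: `timesteps_for_restart`, `stored_timesteps` (the original schedule, assumed to have been saved in `depletion_results.h5`), and `n_completed` (the number of steps already finished) are illustrative names, not part of the existing OpenMC API.

```python
def timesteps_for_restart(timesteps=None, stored_timesteps=None, n_completed=0):
    """Choose the timesteps a restarted run should execute.

    timesteps        -- explicit steps from the user (status quo), or None
    stored_timesteps -- original schedule assumed to be saved with the
                        results (hypothetical), or None if unavailable
    n_completed      -- number of original steps already finished
    """
    if timesteps is not None:
        # Status quo: the user states exactly what to run.
        return list(timesteps)
    if stored_timesteps is None:
        raise ValueError("no timesteps given and none stored in the results")
    if not 0 <= n_completed <= len(stored_timesteps):
        raise ValueError("n_completed is outside the stored schedule")
    # Proposed behavior: finish whatever is left of the original request.
    return list(stored_timesteps[n_completed:])


# Test-style checks for both code paths.
assert timesteps_for_restart(timesteps=[1.0, 1.0]) == [1.0, 1.0]
assert timesteps_for_restart(stored_timesteps=[1.0] * 5, n_completed=2) == [1.0] * 3
```

Keeping the explicit `timesteps` path unchanged would mean existing restart scripts behave as before, and only callers that omit the argument get the new behavior.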