openmc-dev / openmc

OpenMC Monte Carlo Code
https://docs.openmc.org

[feature request] Depletion restart without supplying timesteps #2871

Open lewisgross1296 opened 7 months ago

lewisgross1296 commented 7 months ago

Description

Recently, I've been running some hefty depletion simulations on my HPC cluster. They usually finish within the max runtime, but this time they took longer and got killed mid-simulation. I'm now in the business of restarting the simulation to finish the job.

One issue I have with the way we do restarts is that timesteps must be provided for a depletion restart. While the user should be able to figure out which timesteps will reproduce the intended depletion steps, I would really appreciate the option for the restart to just know what was initially requested and seamlessly pick up where it left off. I don't think it would be hard to store the initially requested time steps, though I'm not sure how much refactoring is required to allow this. My main concern is that it is easy to provide restart timesteps that differ from what you actually wanted.

Quick anecdote as to why I think this would be beneficial:

I was playing around with this pincell depletion example and was able to cause a situation where the simulation was killed while writing the openmc_simulation_n3.h5 file. The original simulation requests time_steps = [1.0, 1.0, 1.0, 1.0, 1.0] # days. In this case, when I restarted the simulation, I told it to restart with time_steps = [1.0, 1.0] # days. Because the simulation got killed mid h5 write, it re-did the transport + depletion for that step and overwrote the faulty openmc_simulation_n3.h5 file. This then counted as one of my timesteps provided.

In this case, I was off by one because I thought it would start with openmc_simulation_n4.h5, but it really needed to redo n3. OpenMC didn't run the final eigenvalue sim + write the openmc_simulation_n5.h5, as I intended.

On HPC, it is very possible to get killed while writing one of the openmc_simulation_n<N>.h5 files. In a more costly simulation, I would just launch the job and come back expecting it to finish. I'd be sad if I came back later and saw that I actually needed to restart one more time, and I would probably doubt that I did everything correctly and spend time verifying what happened.

Alternatives

Status quo: requiring the user to figure out what timesteps to provide with a restart.

Compatibility

The API would change by not requiring timesteps as an argument to openmc.deplete.abc.Integrator. In the case that it is not provided, the simulation should figure out what was initially requested and finish that set of timesteps. I don't think it would take up much memory to store this info in depletion_results.h5.

I think it would be easy to write a test for this API change as well.
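
To make the proposed behavior concrete, here is a minimal sketch in plain Python. It is not OpenMC code: resolve_restart_timesteps is a hypothetical helper, stored_timesteps stands in for the originally requested steps (assumed to have been saved alongside the results), and n_completed stands in for the number of steps already recorded as finished. How either value would actually be read from depletion_results.h5 is left out.

```python
from typing import List, Optional, Sequence


def resolve_restart_timesteps(
    stored_timesteps: Sequence[float],
    n_completed: int,
    timesteps: Optional[Sequence[float]] = None,
) -> List[float]:
    """Pick the timesteps a restart should run (hypothetical helper)."""
    if timesteps is not None:
        # Status quo: the caller states explicitly what to run.
        return list(timesteps)
    if not 0 <= n_completed <= len(stored_timesteps):
        raise ValueError("n_completed is outside the original schedule")
    # Proposed behavior: finish whatever is left of the original request.
    return list(stored_timesteps[n_completed:])


original = [1.0, 1.0, 1.0, 1.0, 1.0]  # days, as in the pincell example

# The count of completed steps is illustrative; miscounting it by one
# changes how many steps the restart runs.
remaining_if_three_done = resolve_restart_timesteps(original, n_completed=3)
remaining_if_two_done = resolve_restart_timesteps(original, n_completed=2)
```

The point of the sketch is that the count of finished steps comes from what was recorded rather than from the user's guess, which is where the off-by-one in the anecdote came from. A test for the API change could assert on exactly these returned lists.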

pshriwise commented 7 months ago

As discussed in person, an appropriate option for simpler continuation runs might be to add a flag to the Integrator classes indicating that the timesteps provided are the original timesteps for the run, so that the integrator picks up at the next timestep based on the timesteps present in the depletion_results.h5 file provided to the Operator.

Validation of the original timesteps should include:

In the absence of this flag, the timesteps provided to the Integrator would be treated as additional time steps in the calculation.
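
The two interpretations of the provided timesteps can be sketched as follows. This is only an illustration in plain Python: the flag name continue_timesteps, the function steps_to_run, and the n_completed argument are all hypothetical, and the single length check is an assumption of mine rather than the validation discussed above.

```python
from typing import List, Sequence


def steps_to_run(
    timesteps: Sequence[float],
    n_completed: int,
    continue_timesteps: bool = False,
) -> List[float]:
    """Return the steps an integrator would still execute (hypothetical).

    n_completed is the number of steps already present in the results
    file; continue_timesteps is a made-up name for the proposed flag.
    """
    if not continue_timesteps:
        # Without the flag: everything provided is additional work.
        return list(timesteps)
    if n_completed > len(timesteps):
        # Assumed sanity check: the results cannot hold more steps
        # than the original schedule requested.
        raise ValueError("results contain more steps than were provided")
    # With the flag: the list is the original schedule, so skip the
    # steps that are already done.
    return list(timesteps[n_completed:])


schedule = [1.0, 1.0, 1.0, 1.0, 1.0]  # days
as_additional = steps_to_run(schedule, n_completed=3)
as_original = steps_to_run(schedule, n_completed=3, continue_timesteps=True)
```

With the flag off, the same five-step list would add five more steps to the run. With the flag on, only the unfinished tail would be executed, so the user could resubmit the original script unchanged.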