SciML / OrdinaryDiffEq.jl

High performance ordinary differential equation (ODE) and differential-algebraic equation (DAE) solvers, including neural ordinary differential equations (neural ODEs) and scientific machine learning (SciML)
https://diffeq.sciml.ai/latest/

Ability to transparently restart time integration for adaptive integration schemes #1972

Open sloede opened 1 year ago

sloede commented 1 year ago

Is there a way to transparently restart the time integration with an integrator that has internal state (beyond the current iteration and current time), such as adaptive error-based time integration schemes?

Motivation: In Trixi.jl, we save intermediate checkpoint files (or "restart files") for long-running simulations. When we decide to continue the simulation, we can restart it from an existing restart file. Our solvers are set up such that the restart is transparent - that is, there is no difference in the solution whether it was obtained from a single simulation or from one that was restarted multiple times.

This already works great for classical Runge-Kutta schemes, but adaptive time integration schemes have an internal state that needs to somehow be preserved. That is, it would have to be stored to a file on disk and then loaded again once a simulation continues.
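To make that internal state concrete, here is a minimal Python sketch of a PI step size controller. It is a toy model, not OrdinaryDiffEq code: the class `ToyPIController`, its `prev_err` field, the exponents, and the error values are all assumptions chosen for illustration. It compares an uninterrupted run with a restart that drops the controller state and a restart that restores it.

```python
class ToyPIController:
    """Toy PI step size controller (hypothetical, for illustration only)."""

    def __init__(self, beta1=0.7, beta2=0.4, order=5):
        self.beta1 = beta1 / order
        self.beta2 = beta2 / order
        self.prev_err = 1.0  # internal state carried from one step to the next

    def next_dt(self, dt, err):
        # The new step size depends on the current AND the previous error estimate.
        factor = err ** (-self.beta1) * self.prev_err ** self.beta2
        self.prev_err = err
        return dt * factor


def run(errs, controller, dt):
    """Feed a sequence of (made-up) normalized error estimates to the controller."""
    for err in errs:
        dt = controller.next_dt(dt, err)
    return dt


errs = [0.5, 0.8, 0.3, 0.6]

# Reference: one uninterrupted run.
dt_full = run(errs, ToyPIController(), 0.1)

# First half of an interrupted run.
first = ToyPIController()
dt_mid = run(errs[:2], first, 0.1)

# Naive restart: dt is kept, but the controller starts from scratch.
dt_naive = run(errs[2:], ToyPIController(), dt_mid)

# Transparent restart: the controller state is restored as well.
restored = ToyPIController()
restored.prev_err = first.prev_err
dt_transparent = run(errs[2:], restored, dt_mid)
```

In this sketch only the restart that restores `prev_err` reproduces the step size of the uninterrupted run; the naive restart drifts away from it on the first step after the restart.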

My question is thus whether such an ability already exists in OrdinaryDiffEq and, if not, whether there are useful hacks one could employ to achieve the same thing. Note that our restart files are written in HDF5, so we wouldn't want to rely on something like BSON. Thus, if some serialization process is available, something ASCII-based would be preferable (although I guess we could also store binary data in the restart file).

cc @simoncan @ranocha

ranocha commented 1 year ago

I think we just have to save the controller in our restart files (manually), load it from there, and pass it to solve(ode, alg; controller, kwargs...). We will have to ensure ourselves that we use the same time integration method, callbacks, etc., though.

Thus, the only question is whether the internal structure of the controllers (mainly PIController and PIDController) can become part of the public API or whether we should implement a conversion to/from a simple array or something like that. I guess the latter approach would probably be best.
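A conversion to and from a simple array could look roughly like the following Python sketch. Everything in it is hypothetical: `PIDState`, its `err` field holding the last three error estimates, and the helpers `to_array` and `from_array` are illustrative names and do not reflect the actual layout of PIController or PIDController in OrdinaryDiffEq.

```python
from dataclasses import dataclass, field


@dataclass
class PIDState:
    """Hypothetical mutable state of a PID step size controller."""
    # Last three normalized error estimates (assumed layout, newest first).
    err: list = field(default_factory=lambda: [1.0, 1.0, 1.0])


def to_array(state):
    """Flatten the controller state into a plain list of floats."""
    return [float(e) for e in state.err]


def from_array(values):
    """Rebuild the controller state from a plain list of floats."""
    if len(values) != 3:
        raise ValueError("expected exactly three error estimates")
    return PIDState(err=[float(v) for v in values])


state = PIDState(err=[0.5, 0.8, 0.3])
flat = to_array(state)         # could be written as a small dataset in a restart file
roundtrip = from_array(flat)   # state recovered when the simulation continues
```

The point of this design is that the restart file only ever sees a flat numeric array, so the internal structure of the controller would not have to become public API.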

ChrisRackauckas commented 1 year ago

Or have a save/load interface on integrator that ensures it.

sloede commented 1 year ago

> Or have a save/load interface on integrator that ensures it.

That would be great. That is, something that "serializes" the internal numeric state of an integrator, without storing the info on algorithms etc. and, of course, without the actual data.

ranocha commented 1 year ago

@sloede and I discussed this a bit further to figure out the requirements. Status quo: It would be great for us to be able to serialize/deserialize the controller and integrator.stats (plus integrator.iter). We already take care of integrator.t, integrator.dt, and integrator.u.
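As a rough illustration of these requirements, the following Python sketch packs an iteration counter, a few statistics counters, and the controller state into one ASCII record and reads it back bit for bit. The record layout, the counter names, and the functions `dump_restart` and `load_restart` are assumptions made for this example, not an existing OrdinaryDiffEq or Trixi.jl interface.

```python
import json


def dump_restart(iter_count, stats, controller_err):
    """Serialize the numeric restart state to an ASCII (JSON) string."""
    record = {
        "iter": iter_count,
        "stats": stats,  # integer counters only
        # float.hex() gives an ASCII form that converts back without rounding.
        "controller_err": [float(x).hex() for x in controller_err],
    }
    return json.dumps(record, sort_keys=True)


def load_restart(text):
    """Inverse of dump_restart; returns a dict with floats restored."""
    record = json.loads(text)
    record["controller_err"] = [float.fromhex(s) for s in record["controller_err"]]
    return record


# Made-up values standing in for the state at a checkpoint.
stats = {"nf": 1234, "naccept": 200, "nreject": 5}
errs = [0.1, 1.0 / 3.0, 0.8]

text = dump_restart(205, stats, errs)   # ASCII string, e.g. stored as an HDF5 attribute
loaded = load_restart(text)
```

Storing floats in hexadecimal form is one way to keep the record ASCII-based while still making the restart exactly reproducible; storing the raw binary values would work as well.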

SimonCan commented 1 year ago

I suspect we can save the OrdinaryDiffEq integrator itself together with the snapshots. When loaded, it should ideally contain all the information we need. We would then have to switch from the solve function to the solve! function, which modifies the integrator in place and, with it, dt. Apart from reading and writing the integrator, this is already done in, e.g., examples/structured_2d_dgsem on the sc/restart_indices_fix branch.