JuliaNLSolvers / Optim.jl

Optimization functions for Julia

Checkpointing in Optim #884

Open umbriquse opened 3 years ago

umbriquse commented 3 years ago

I've run into a situation where I was running an optimization algorithm but did not have enough time left in the day to let the optimizer finish and find a solution. Have you considered adding a checkpointing system to Optim, so that users can stop an algorithm at an arbitrary point and resume from that point later (i.e., save the progress the optimizer has already made)?

pkofod commented 3 years ago

How would you stop it? With SIGINT?

umbriquse commented 3 years ago

Yes, but also in the case of a power failure, an error in the user code after some iterations, or a segfault or bug in Julia that hasn't been fixed. Mostly with SIGINT, though.

umbriquse commented 3 years ago

This idea comes from another package that I am using, called DFTK. It relies on JLD2 and saves a snapshot of the current state, so that the progress made isn't lost and the computation can be continued at another time. If you're curious, it's in the jld2io.jl file.

pkofod commented 3 years ago

Okay okay, I think I may understand. If you're that worried about sudden problems I would suggest you use a callback to save the current state. Does that make sense?
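The callback suggestion could look roughly like the sketch below. It is not code from this thread or from Optim: `checkpoint_callback`, the `every` parameter, and the file layout are hypothetical names chosen for illustration. It assumes the callback is handed a single state object with an `iteration` field and a `metadata` dictionary that contains the current point under `"x"`, which is how Optim's documentation describes the callback argument when `extended_trace = true` and `store_trace = false`. This may differ between Optim versions.

```julia
using Serialization  # standard library; JLD2 could be used in a similar way

# Build a callback that writes the current iterate to `path` every `every` iterations.
# Assumption: `state.iteration` and `state.metadata["x"]` are available
# (Optim is documented to fill in "x" only when `extended_trace = true`).
function checkpoint_callback(path::AbstractString; every::Integer = 10)
    return function (state)
        if state.iteration % every == 0 && haskey(state.metadata, "x")
            tmp = path * ".tmp"
            # Write to a temporary file first so a crash mid-write
            # cannot corrupt the previous checkpoint.
            serialize(tmp, (iteration = state.iteration, x = copy(state.metadata["x"])))
            mv(tmp, path; force = true)
        end
        return false  # returning true would ask the optimizer to stop
    end
end
```

Under those assumptions, the callback would be passed as `Optim.Options(callback = checkpoint_callback("checkpoint.jls"), extended_trace = true)`. Only the current point is saved here, not the solver's internal state.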

umbriquse commented 3 years ago

Yes. I had thought it wasn't possible to save the optimizer's progress from a callback and then restart an optimizer from that saved state. Thank you, I'll do that.

pkofod commented 3 years ago

It's not super easy to start it again, but I can help you set it up if you need something like this.
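One reason restarting is not trivial is that a saved point alone does not capture the solver's internal state (for example, a quasi-Newton method's Hessian approximation), so a restart is a warm start rather than an exact continuation. A minimal sketch of that warm start follows. The helper name `initial_point` and the checkpoint layout (a named tuple with an `x` field, written with `serialize`) are assumptions made for illustration, not part of Optim.

```julia
using Serialization  # standard library

# Choose the starting point: the checkpointed iterate if a checkpoint
# file exists, otherwise the user-supplied `x0`.
# Assumption: the file holds a named tuple with an `x` field.
function initial_point(path::AbstractString, x0::AbstractVector)
    isfile(path) || return copy(x0)
    checkpoint = deserialize(path)
    return copy(checkpoint.x)
end
```

The result would then be used as the initial guess, e.g. `optimize(f, initial_point("checkpoint.jls", x0), BFGS())`. The restarted run rebuilds curvature information and line-search history from scratch, so its iterates will generally differ from those of an uninterrupted run.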

umbriquse commented 3 years ago

Thank you for offering, but I think I can work it out. Would you like me to share the function once I've written it, though? Others could then add it to their callback to save the state of their optimizers.

pkofod commented 3 years ago

Yeah, please show it. If nothing else we can put it in the docs :)