oar-team / batsim

Batsim: Infrastructure simulator for job and I/O scheduling
GNU Lesser General Public License v3.0

Enhancement: Simulation checkpointing #8

Open mpoquet opened 7 years ago

mpoquet commented 7 years ago

It would be very nice to have some kind of checkpointing mechanism in Batsim, so that long simulations can be stopped and resumed later.

mpoquet commented 6 years ago

This could be achieved by using DMTCP on a simulation instance manager (e.g., robin).

It may work directly, or DMTCP plugins might be needed. Such plugins make it possible to define how specific parts (ZMQ?) are checkpointed and restarted.
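
To make the idea concrete, here is a rough, untested sketch of how the DMTCP wrapping could look. It only builds the command lines and does not run anything. The `robin expe.yaml` invocation, the checkpoint directory and the interval are placeholder assumptions, and whether Batsim's ZMQ sockets survive a restart without a plugin is still an open question.

```python
import shlex
from pathlib import Path


def launch_command(sim_argv, ckpt_dir, interval_s=3600):
    """Build the argv that starts sim_argv under DMTCP with periodic checkpoints."""
    return [
        "dmtcp_launch",
        "--ckptdir", str(ckpt_dir),      # where checkpoint images are written
        "--interval", str(interval_s),   # seconds between automatic checkpoints
        *sim_argv,
    ]


def restart_command(ckpt_dir):
    """Build the argv that resumes from the checkpoint images found in ckpt_dir."""
    # DMTCP writes one ckpt_*.dmtcp image per checkpointed process.
    images = sorted(str(p) for p in Path(ckpt_dir).glob("ckpt_*.dmtcp"))
    if not images:
        raise FileNotFoundError(f"no DMTCP checkpoint image in {ckpt_dir}")
    return ["dmtcp_restart", *images]


if __name__ == "__main__":
    # Hypothetical simulation instance: robin driven by a description file.
    sim = ["robin", "expe.yaml"]
    print(shlex.join(launch_command(sim, "ckpt", interval_s=600)))
```

A checkpoint could also be requested on demand with `dmtcp_command --checkpoint` while the simulation is running, instead of relying on the interval alone.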