mpiraux / mpf

Minimal Performance Framework

Resuming experiment #17

Closed by mpiraux 7 months ago

mpiraux commented 8 months ago

When an experiment fails, e.g. due to a full disk or unavailable SSH materials, there is no easy way to restart it.

Several steps toward reproducibility are already in place: the exploration of variable values is seeded, and the experiment and cluster files are copied to the experiment results directory.

mpf.run_experiment should be modified to yield partial DataFrames containing the intermediate results. The last saved DataFrame could then be passed back to mpf.run_experiment, along with the experiment id, to resume the experiment where it stopped.
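
A minimal sketch of the idea, not mpf's actual API: the names explore, run_experiment_resumable, run_fn and partial_df are hypothetical, and the experiment id is not modelled here. It assumes each run is identified by its index in a seeded exploration, so that the same seed regenerates the same list of runs and the rows already present in the partial DataFrame can be skipped.

```python
import random

import pandas as pd


def explore(variables, n_runs, seed):
    """Draw n_runs variable assignments deterministically from a seed."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in variables.items()}
        for _ in range(n_runs)
    ]


def run_experiment_resumable(variables, n_runs, seed, run_fn, partial_df=None):
    """Hypothetical stand-in for mpf.run_experiment.

    Yields a DataFrame after every completed run. If partial_df is given,
    runs whose index already appears in its "run" column are skipped.
    """
    rows = [] if partial_df is None else partial_df.to_dict("records")
    done = {row["run"] for row in rows}
    for run_id, values in enumerate(explore(variables, n_runs, seed)):
        if run_id in done:
            continue  # result already saved before the failure
        result = run_fn(**values)
        rows.append({"run": run_id, **values, "result": result})
        yield pd.DataFrame(rows)


# Illustrative variables and a run function that fails on its third call.
variables = {"cc": ["cubic", "bbr"], "delay_ms": [10, 50, 100]}
calls = []


def flaky(cc, delay_ms):
    calls.append((cc, delay_ms))
    if len(calls) == 3:
        raise OSError("disk full")
    return delay_ms * 2


def stable(cc, delay_ms):
    return delay_ms * 2


last = None
try:
    for last in run_experiment_resumable(variables, 5, 42, flaky):
        pass  # a real caller would save `last` to disk here
except OSError:
    pass  # `last` still holds the results of the runs that completed

# Resume: same variables and seed, plus the last partial DataFrame.
final = last
for final in run_experiment_resumable(variables, 5, 42, stable, partial_df=last):
    pass
```

Because the exploration is seeded, the resumed call regenerates the same assignments and only executes the runs missing from the partial DataFrame. A real implementation would also need to decide how to handle a run that was interrupted midway.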