autonomio / talos

Hyperparameter Experiments with TensorFlow and Keras
https://autonom.io
MIT License

Resume run possible? #545

Closed zonexo closed 3 years ago

zonexo commented 3 years ago

Hi,

My school cluster restricts each job to 24 hours of run time. Is it possible to implement some way to resume an experiment from the point where it previously stopped?

If so, how is it done?

Thanks!

mikkokotila commented 3 years ago

Related to #482 and #97. I think it's time to decide what to do with this.

At the moment, the "best" way would be to split the experiment into multiple experiments and then join the CSV logs at the end.
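
As a rough illustration of the "join the CSV logs" step only, here is a minimal sketch using pandas. It assumes each partial experiment has already written its own CSV log with the same columns; the `join_experiment_logs` helper and the `part_*.csv` file names are hypothetical and not part of Talos.

```python
import glob

import pandas as pd


def join_experiment_logs(pattern):
    """Concatenate every CSV log matching `pattern` into one DataFrame.

    `pattern` is a glob such as "logs/part_*.csv" (a hypothetical naming
    scheme; use whatever names your partial experiments actually produced).
    """
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no CSV logs match {pattern!r}")

    # Read each partial log and stack them row-wise.
    frames = [pd.read_csv(path) for path in paths]
    combined = pd.concat(frames, ignore_index=True)

    # Drop rows that are exact duplicates, e.g. a permutation that was
    # logged by two overlapping partial runs.
    return combined.drop_duplicates().reset_index(drop=True)
```

Something like `join_experiment_logs("logs/part_*.csv").to_csv("combined.csv", index=False)` would then write a single log for analysis. Splitting the parameter space so that each partial experiment finishes within the 24-hour limit still has to be done by hand.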

zonexo commented 3 years ago

OK, thanks for the update!