IBM / mi-prometheus

Enabling reproducible Machine Learning research
http://mi-prometheus.rtfd.io/
Apache License 2.0

GridTester only loads model_best.pt #103

Closed vmarois closed 5 years ago

vmarois commented 5 years ago

Currently playing with the GridTester to handle the ViGIL experiments. When fine-tuning, I am running into a situation where the model does not get saved at every validation step (or not frequently enough), because the validation loss is not decreasing. It'd be nice if the GridTester could search for the most recent saved_intermediate model and load it instead of model_best.pt.
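A minimal sketch of the requested behaviour, not the actual GridTester code. It assumes all checkpoints sit in one directory as `.pt` files; the helper name `pick_checkpoint` and that layout are hypothetical. It returns the most recently modified checkpoint other than `model_best.pt`, and falls back to `model_best.pt` when no other checkpoint exists.

```python
from pathlib import Path
from typing import Optional


def pick_checkpoint(models_dir: str, best_name: str = "model_best.pt") -> Optional[Path]:
    """Return the newest intermediate checkpoint, else the best one, else None.

    Hypothetical helper: the directory layout and file naming are assumptions,
    not taken from mi-prometheus.
    """
    directory = Path(models_dir)
    # Every .pt file except the "best" checkpoint counts as intermediate.
    intermediate = [p for p in directory.glob("*.pt") if p.name != best_name]
    if intermediate:
        # The most recently modified file wins.
        return max(intermediate, key=lambda p: p.stat().st_mtime)
    # No intermediate checkpoint found: fall back to the best model, if present.
    best = directory / best_name
    return best if best.is_file() else None
```

Selecting by modification time avoids depending on how the episode number is encoded in the file name, at the cost of being wrong if files are copied or touched after training.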

tkornuta-ibm commented 5 years ago

Wait... what?

First of all, you mentioned fine-tuning, so we are talking about GridTrainer, not GridTester, right? ;)

Then, you are saying that sometimes model_best.pt is not saved at all? How is that possible? As far as I know, during fine-tuning (i.e. "training with the pre-trained model loaded") we do not take into account the accuracy stored in the loaded checkpoint, so it should save model_best.pt at least at step 0, right?

And if that is the best model, then so be it; I see no point in using a different one. Which one? The last one? Why?
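The reasoning above can be sketched as follows. This is an illustration under stated assumptions, not the mi-prometheus trainer: the class `BestModelTracker` is hypothetical, and it assumes the best loss starts at infinity instead of being restored from the loaded checkpoint. Under that assumption the first validation always counts as an improvement, so a best model is written at step 0.

```python
import math


class BestModelTracker:
    """Hypothetical tracker; not the mi-prometheus implementation."""

    def __init__(self) -> None:
        # Start from +inf instead of the loss stored in the loaded checkpoint.
        self.best_loss = math.inf

    def should_save(self, validation_loss: float) -> bool:
        """Return True (and record the loss) when validation improves."""
        if validation_loss < self.best_loss:
            self.best_loss = validation_loss
            return True
        return False
```

If the validation loss never drops below its step-0 value, `should_save` returns `True` only once, which matches the situation described in the original report.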

vmarois commented 5 years ago

Closing this for now. Not really urgent anyway.