Models aren't saved after fine-tuning depending on the save steps

alexeifigueroa commented 3 years ago

If the save steps are greater than the total number of global steps, the fine-tuned model is never saved (https://github.com/facebookresearch/bio-lm/blob/552de5c98fc421758f753d73162b9c84f0e755b2/biolm/run_classification.py#L239) and the code then crashes (https://github.com/facebookresearch/bio-lm/blob/552de5c98fc421758f753d73162b9c84f0e755b2/biolm/run_classification.py#L713). As a result, the save steps always have to be set to a smaller number, and there is never a "final" version of the model when training is 100% complete, unless the save steps happen to be a divisor of the total steps, which requires knowing the total beforehand.

usuyama commented 3 years ago

We can add `or (t_total == global_step)` at L239 to run the checkpoint/validation at the end of training.
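
A minimal sketch of how that extra clause changes which steps trigger a checkpoint. It assumes the existing check at L239 is a periodic test of the form `global_step % save_steps == 0`, which is not quoted in this thread; `should_checkpoint` and `checkpoint_steps` are hypothetical helpers written for illustration, not functions from `run_classification.py`.

```python
from typing import List


def should_checkpoint(global_step: int, save_steps: int, t_total: int) -> bool:
    """Decide whether to run checkpoint/validation at this step.

    Assumption: the original condition is a periodic modulo check.
    The `final` clause is the proposed `t_total == global_step` addition.
    """
    periodic = save_steps > 0 and global_step % save_steps == 0
    final = global_step == t_total
    return periodic or final


def checkpoint_steps(save_steps: int, t_total: int) -> List[int]:
    """List every global step (1..t_total) at which a checkpoint would run."""
    return [
        step
        for step in range(1, t_total + 1)
        if should_checkpoint(step, save_steps, t_total)
    ]


# save_steps larger than the total: only the final step is checkpointed.
print(checkpoint_steps(save_steps=500, t_total=120))  # [120]
# save_steps not a divisor of the total: periodic saves plus the final one.
print(checkpoint_steps(save_steps=50, t_total=120))   # [50, 100, 120]
```

With the extra clause, the final step is always checkpointed, so a saved model exists at the end of training regardless of how `save_steps` relates to the total.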

alexeifigueroa commented 3 years ago
