Closed · thvasilo closed this issue 1 week ago
The current logic of these 3 arguments is:

===== Command Line Interface =======

- With `topk_models_to_save` set, we check whether `save_model_frequency` is divisible by `eval_frequency`; if it is not divisible, break and raise an error.
- With `topk_models_to_save` set, models are only saved at iterations that satisfy both `save_model_frequency` and `eval_frequency`, e.g., 20k, 40k, 60k, ... for a 10k `save_model_frequency` and a 4k `eval_frequency`. This is because current model saving depends on whether model evaluation is done in the same iteration.
- Without `no_validation` set, always do validation according to `eval_frequency`.
- The best model path is based on both the evaluation and save-model operations.

===== API =======

- `topk_models_to_save==1`, which means only save the single best model.
- `no_validation`.
- Models are only saved at iterations that satisfy both `save_model_frequency` and `eval_frequency`, e.g., 20k, 40k, 60k, ... for a 10k `save_model_frequency` and a 4k `eval_frequency`. This is because current model saving depends on whether model evaluation is done in the same iteration.

Based on the current logic mentioned above, the design decision would be: should we always run evaluation (ignoring `no_validation`) whenever we save a model? I will vote for the 2nd option from a user perspective.
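To make the current behavior concrete, here is a minimal sketch of the save logic described above. The helper names (`check_frequencies`, `save_iterations`) are hypothetical illustrations, not GraphStorm's actual code:

```python
def check_frequencies(save_model_frequency, eval_frequency):
    """CLI-style check: raise an error when save_model_frequency is not
    divisible by eval_frequency."""
    if save_model_frequency % eval_frequency != 0:
        raise ValueError(
            f"save_model_frequency ({save_model_frequency}) is not "
            f"divisible by eval_frequency ({eval_frequency})"
        )

def save_iterations(total_steps, save_model_frequency, eval_frequency):
    """Iterations at which a model is actually saved: saving piggybacks on
    evaluation, so only common multiples of both frequencies qualify."""
    return [
        step for step in range(1, total_steps + 1)
        if step % save_model_frequency == 0 and step % eval_frequency == 0
    ]

# The 10k/4k example: these frequencies fail the divisibility check ...
try:
    check_frequencies(10_000, 4_000)
except ValueError as err:
    print(err)

# ... and saves would land only at common multiples: 20k, 40k, 60k, ...
print(save_iterations(60_000, 10_000, 4_000))  # [20000, 40000, 60000]
```

Note that under this logic a run shorter than 20k steps would save no model at all with the 10k/4k settings, which is the "no models being saved" failure mode mentioned in this issue.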
In order to be able to save the top-k models during training, we use their performance on the validation set to rank them. Right now we rely on evaluation metrics calculated during "regular" evaluation runs, which happen every `eval_frequency` steps, to do the ranking. We implicitly assume that `save_model_frequency` is divisible by `eval_frequency` for this to work. However, when that is not true, the user's choice of `save_model_frequency` is not respected, which could even lead to no models being saved (?).

We should change the behavior to always perform an eval at `save_model_frequency` steps if `topk_models_to_save` is set by the user. If `topk_models_to_save` is not set, we should just save the model without an eval.

I also observed that no evaluation is run in one case (`save_model_frequency` 10k, `eval_frequency` 4k). We'll need to investigate what happens there. I'll run some quick confirmation experiments for this.
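The proposed change above could be sketched like this. This is a hypothetical training-loop fragment, not actual GraphStorm code; the `evaluate` and `save_model` callables stand in for whatever the trainer actually uses:

```python
def maybe_save(step, save_model_frequency, topk_models_to_save,
               evaluate, save_model):
    """Proposed behavior: at every save_model_frequency step, force an
    eval first when topk_models_to_save is set (so the saved model can be
    ranked against the current top-k); otherwise save without evaluating."""
    if step % save_model_frequency != 0:
        return
    if topk_models_to_save is not None:
        # Forced eval at save time, independent of eval_frequency.
        score = evaluate()
        save_model(step, score=score)
    else:
        # No ranking needed: plain save, skip the eval.
        save_model(step, score=None)

# Example: with save_model_frequency=10k and topk set, evals now happen at
# 10k, 20k, 30k, ... regardless of eval_frequency.
events = []
maybe_save(10_000, 10_000, topk_models_to_save=3,
           evaluate=lambda: 0.9,
           save_model=lambda step, score: events.append((step, score)))
print(events)  # [(10000, 0.9)]
```

This keeps the user's `save_model_frequency` choice intact for any `eval_frequency`, removing the implicit divisibility assumption.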