Why is evaluation run on every checkpoint saved during training? I see that the function _best_trial_info is new in ALBERT and was not present in BERT's run_classifier. I fine-tuned my ALBERT model for 75000 steps, and it seems to be evaluating all of the resulting checkpoints.