We need a more accurate measurement of model quality.
However, getting that accuracy from online evaluation (the current approach) adds significant overhead to training.
So we'd like to switch to offline evaluation:
1) Checkpoint the model periodically during training.
2) After training, restore the checkpointed model and evaluate it.
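
The two steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual implementation: the dict-based model, `train_step`, `evaluate`, the `ckpt_*.pkl` file naming, and the `ckpt_every` parameter are all hypothetical stand-ins chosen so the example runs with the standard library alone.

```python
import pickle
import tempfile
from pathlib import Path


def train_step(model, step):
    # Hypothetical stand-in for a real training step:
    # move the single weight halfway toward the target value 1.0.
    model["w"] += 0.5 * (1.0 - model["w"])
    model["step"] = step


def evaluate(model):
    # Hypothetical quality metric: squared error against the target 1.0.
    return (1.0 - model["w"]) ** 2


def train_with_checkpoints(ckpt_dir, num_steps=6, ckpt_every=2):
    # Step 1: checkpoint periodically during training.
    # The training loop only saves state; it never runs evaluation.
    model = {"w": 0.0, "step": 0}
    for step in range(1, num_steps + 1):
        train_step(model, step)
        if step % ckpt_every == 0:
            with open(ckpt_dir / f"ckpt_{step:06d}.pkl", "wb") as f:
                pickle.dump(model, f)


def evaluate_offline(ckpt_dir):
    # Step 2: after training, restore each saved checkpoint and evaluate it.
    results = {}
    for path in sorted(ckpt_dir.glob("ckpt_*.pkl")):
        with open(path, "rb") as f:
            model = pickle.load(f)
        results[model["step"]] = evaluate(model)
    return results


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        ckpt_dir = Path(tmp)
        train_with_checkpoints(ckpt_dir)
        return evaluate_offline(ckpt_dir)


results = run_demo()
for step, error in sorted(results.items()):
    print(f"step {step}: error {error}")
```

Because evaluation runs only after training finishes, it can be as thorough as needed without slowing the training loop; the cost that remains during training is the periodic save.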