Once a standalone lm-eval-harness script is created in https://github.com/Zyphra/Megatron-LM/issues/11, we will replace the periodic validation loss check with a few important lm-eval-harness tasks like lambada.
Tracking how those task scores change over the course of training would be much more useful than the loss on a random validation set, and running the standalone script by hand for every checkpoint would be tedious.
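A rough sketch of the periodic hook, to make the request concrete. Every name here (`maybe_evaluate`, `run_tasks`, `eval_interval`, `fake_run_tasks`) is hypothetical and not taken from Megatron-LM or lm-eval-harness. The `run_tasks` callback stands in for whatever the standalone script from issue 11 ends up exposing, and the scores are placeholders, not real results.

```python
from typing import Callable, Dict, List

# Hypothetical callback type: (iteration, task names) -> {task name: score}.
# In practice this would wrap the standalone lm-eval-harness script.
EvalFn = Callable[[int, List[str]], Dict[str, float]]


def maybe_evaluate(
    iteration: int,
    eval_interval: int,
    tasks: List[str],
    run_tasks: EvalFn,
    history: Dict[int, Dict[str, float]],
) -> bool:
    """Run the eval tasks every `eval_interval` iterations and record the scores.

    Returns True if an evaluation ran at this iteration.
    """
    if eval_interval <= 0 or iteration <= 0 or iteration % eval_interval != 0:
        return False
    history[iteration] = run_tasks(iteration, tasks)
    return True


def fake_run_tasks(iteration: int, tasks: List[str]) -> Dict[str, float]:
    # Stand-in for the real evaluation; returns placeholder scores only.
    return {task: 0.0 for task in tasks}


# Simulated training loop: evaluate on "lambada" every 5 iterations.
history: Dict[int, Dict[str, float]] = {}
for it in range(1, 11):
    maybe_evaluate(it, 5, ["lambada"], fake_run_tasks, history)
```

Keeping the scores keyed by iteration in `history` is one way to get the "change over time" view without re-running the script per checkpoint; logging them to the existing training logger would work just as well.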