[ENHANCEMENT] Replace periodic validation with eval harness calls

Once a standalone lm-eval-harness script is created in https://github.com/Zyphra/Megatron-LM/issues/11, we will replace the periodic validation loss check with a few important lm-eval-harness tasks such as lambada.

Seeing how those task scores change over the course of training would be much more useful than watching the loss on a random validation set, and it would be tedious to run the standalone script by hand for every checkpoint.
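As a rough sketch of the shape this could take (not the actual Megatron-LM hook), the training loop could call a small helper at a fixed interval. Everything named here is hypothetical: `maybe_run_eval`, `eval_interval`, and the `run_tasks` callable, which stands in for whatever entry point the standalone script from issue 11 ends up exposing. `fake_runner` is only a placeholder so the sketch runs on its own, and the task name `lambada` is taken from the text above.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical signature: takes task names, returns {task_name: score}.
TaskRunner = Callable[[List[str]], Dict[str, float]]


def maybe_run_eval(
    iteration: int,
    eval_interval: int,
    tasks: List[str],
    run_tasks: TaskRunner,
) -> Optional[Dict[str, float]]:
    """Run the harness tasks every `eval_interval` iterations, else return None."""
    if eval_interval <= 0 or iteration == 0 or iteration % eval_interval != 0:
        return None
    return run_tasks(tasks)


def fake_runner(tasks: List[str]) -> Dict[str, float]:
    # Stand-in for the real harness call; returns a placeholder score per task.
    return {name: 0.0 for name in tasks}


# Toy training loop: collect scores at the iterations where evaluation ran.
history: Dict[int, Dict[str, float]] = {}
for it in range(1, 7):
    scores = maybe_run_eval(it, eval_interval=3, tasks=["lambada"], run_tasks=fake_runner)
    if scores is not None:
        history[it] = scores
```

Passing the runner in as a callable keeps the training loop independent of how the harness is invoked, so the real script can be swapped in once it exists.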
