bigscience-workshop / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Add multiple evaluation compat #336

Open Muennighoff opened 2 years ago

Muennighoff commented 2 years ago

This is still too hacky to be merged 😁

cc @TevenLeScao

Edit: I'll make it less hacky on the other branch, so I'm merging this.