bigscience-workshop / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

MTF optimize dataloading #298

Closed. thomasw21 closed this 2 years ago.

Muennighoff commented 2 years ago

Nice, re-creating an indexed training dataset with this branch to compare it against the one built previously.
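
A minimal sketch (not part of this PR) of how such a comparison could be done, assuming the Megatron-LM-style `megatron.data.indexed_dataset` module layout used by this fork; the dataset path prefixes are placeholders:

```python
# Compare a re-built indexed dataset against the previously built one.
# Assumes megatron.data.indexed_dataset.make_dataset(path, impl, skip_warmup)
# as in Megatron-LM forks; OLD_PREFIX / NEW_PREFIX are hypothetical paths.
import numpy as np
from megatron.data.indexed_dataset import make_dataset

OLD_PREFIX = "data/old_dataset_text_document"  # placeholder
NEW_PREFIX = "data/new_dataset_text_document"  # placeholder

old_ds = make_dataset(OLD_PREFIX, impl="mmap", skip_warmup=True)
new_ds = make_dataset(NEW_PREFIX, impl="mmap", skip_warmup=True)

# Same number of documents in both builds?
assert len(old_ds) == len(new_ds), f"{len(old_ds)} != {len(new_ds)}"

# Spot-check a random sample of documents for identical token ids.
rng = np.random.default_rng(0)
for idx in rng.choice(len(old_ds), size=min(1000, len(old_ds)), replace=False):
    assert np.array_equal(old_ds[idx], new_ds[idx]), f"mismatch at doc {idx}"

print("Spot check passed: sampled documents match between the two builds.")
```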

thomasw21 commented 2 years ago

Let's go!