bigscience-workshop / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Fix various small problems #367

Open janEbert opened 1 year ago

janEbert commented 1 year ago

Problems:

This PR fixes these issues. Since the changes are so small, I didn't bother creating 3 PRs.