argonne-lcf / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Pull in changes from `microsoft/Megatron-DeepSpeed` #35

Open saforem2 opened 2 months ago