argonne-lcf / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Distributed loading v2 #21

zhenghh04 closed 4 months ago