argonne-lcf / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

`flash-attn` fix + new Frameworks on Sunspot #13

Closed: saforem2 closed this 4 months ago

saforem2 commented 4 months ago