argonne-lcf / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Merge `polaris-cuda122` branch into main #11

Closed by saforem2 6 months ago

saforem2 commented 6 months ago