microsoft/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
Other · 1.9k stars · 346 forks
Mistral #365
Closed
Kosei1227 closed 8 months ago