Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM

pentium3 / sys_reading

system paper reading notes

Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM #282

Open pentium3 opened 1 year ago

pentium3 commented 1 year ago

https://arxiv.org/pdf/2104.04473.pdf