NVIDIA / Megatron-LM

Ongoing research training transformer models at scale
https://docs.nvidia.com/megatron-core/developer-guide/latest/user-guide/index.html#quick-start
Other
10.65k stars, 2.39k forks

Fix: misnamed sharded instead of common in checkpoint #1289

Open prrathi opened 1 week ago