NVIDIA / Megatron-LM

Ongoing research training transformer models at scale
https://docs.nvidia.com/megatron-core/developer-guide/latest/user-guide/index.html#quick-start

Add dataset packing #802

Open shamanez opened 1 month ago

shamanez commented 1 month ago

I added dataset packing, similar to the packing option in the Hugging Face SFT trainer.
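
For anyone unfamiliar with the idea, here is a minimal sketch of what packing generally means: tokenized examples are concatenated, separated by an EOS token, and cut into fixed-length sequences so that little or no padding is needed. This is only an illustration, not the code in this PR. The function name `pack_sequences` and its parameters are hypothetical, and dropping the trailing partial chunk is an assumption made to keep the example short.

```python
from typing import Iterable, Iterator, List


def pack_sequences(
    tokenized_examples: Iterable[List[int]],
    seq_length: int,
    eos_token_id: int,
) -> Iterator[List[int]]:
    """Concatenate examples (each followed by EOS) and yield fixed-length chunks.

    Hypothetical helper for illustration; not part of Megatron-LM.
    """
    buffer: List[int] = []
    for tokens in tokenized_examples:
        # Append the example and an EOS separator to the running buffer.
        buffer.extend(tokens)
        buffer.append(eos_token_id)
        # Emit as many full-length sequences as the buffer currently holds.
        while len(buffer) >= seq_length:
            yield buffer[:seq_length]
            buffer = buffer[seq_length:]
    # Assumption: any leftover tokens shorter than seq_length are dropped.


examples = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
packed = list(pack_sequences(examples, seq_length=4, eos_token_id=0))
print(packed)
```

Note that a packed sequence can contain tokens from more than one example, so an implementation may also need to decide whether attention and position IDs should be reset at the example boundaries.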