alibaba / Pai-Megatron-Patch

The official repo of Pai-Megatron-Patch for large-scale LLM & VLM training, developed by Alibaba Cloud.
Apache License 2.0

Implement Sequence Packing in SFT for Qwen2 and Llama-3.1 models #344

Closed by lostkevin 3 weeks ago