alibaba / Pai-Megatron-Patch

The official repo of Pai-Megatron-Patch, developed by Alibaba Cloud for large-scale LLM & VLM training.
Apache License 2.0
723 stars 103 forks

Fix broadcast CPU tensor issue when TP>1 PP=1 in jsonl dataset SFT #377

Status: Closed (lostkevin, closed 1 week ago)