AFDWang / Hetu-Galvatron

Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs). If you are interested, please visit, star, or fork https://github.com/PKU-DAIR/Hetu-Galvatron

adapt PyTorch > 2.0 & grad accumulate #1

Closed: Fizzmy closed this 5 months ago