pytorch / torchtitan

A native PyTorch Library for large model training
BSD 3-Clause "New" or "Revised" License
2.68k stars 212 forks

necessary changes to unblock Sequence Parallel on odd length sequences #686

Open tianyu-l opened 1 week ago

tianyu-l commented 1 week ago

Stack from ghstack (oldest at bottom):