pytorch / torchtitan

A native PyTorch Library for large model training
BSD 3-Clause "New" or "Revised" License

Fix start/stop layer parsing #378

Closed: wconstab closed this 1 month ago

wconstab commented 1 month ago

Stack from ghstack (oldest at bottom):

kwen2501 commented 1 month ago

nit: maybe try not to stack PRs that have no dependency on each other? For example, someone (like me) might want to land this PR now but not be sure whether the base PR needs to land first.