pytorch / torchtitan

A native PyTorch Library for large model training
BSD 3-Clause "New" or "Revised" License

Break down parallelize_llama for inference cases #402

Closed kwen2501 closed 3 months ago

kwen2501 commented 3 months ago

Stack from ghstack (oldest at bottom):

Breaking up parallelize_llama into:

This enables functionality reuse in inference cases, where activation checkpointing and data parallelism (DP) are not needed.

It also improves code modularity and readability.
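The idea can be sketched as follows. This is a minimal illustration of splitting one monolithic parallelization entry point into composable stages, so inference can reuse only the stages it needs; the function names (`apply_tp`, `apply_ac`, `apply_dp`) and the no-op stand-in transforms are hypothetical, not torchtitan's actual API.

```python
class Model:
    """Dummy model that records which transforms were applied."""
    def __init__(self):
        self.applied = []

def apply_tp(model):
    # Stand-in for tensor parallelism (needed in training AND inference).
    model.applied.append("tp")
    return model

def apply_ac(model):
    # Stand-in for activation checkpointing (training only).
    model.applied.append("ac")
    return model

def apply_dp(model):
    # Stand-in for data parallelism (training only).
    model.applied.append("dp")
    return model

def parallelize_llama_train(model):
    # Training composes all stages.
    return apply_dp(apply_ac(apply_tp(model)))

def parallelize_llama_infer(model):
    # Inference reuses only the TP stage; no AC or DP.
    return apply_tp(model)

print(parallelize_llama_train(Model()).applied)  # ['tp', 'ac', 'dp']
print(parallelize_llama_infer(Model()).applied)  # ['tp']
```

Each stage takes and returns the model, so callers can compose exactly the subset they need instead of threading flags through one large function.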