pytorch / PiPPy

Pipeline Parallelism for PyTorch

Implemented flexible PP #1129

Open haocizhang opened 4 months ago

haocizhang commented 4 months ago

Enabled some cases where num_microbatches % pp_size != 0 to work. Using the flex_pp schedule, we have

num_rounds = max(1, n_microbatches // pp_group_size), and the schedule works as long as n_microbatches % num_rounds == 0. A few supported examples (a sketch of this check follows the list):

  1. pp_group_size = 4, n_microbatches = 10. We get num_rounds = 2, and 10 % 2 == 0.
  2. pp_group_size = 4, n_microbatches = 3. We get num_rounds = 1, and 3 % 1 == 0.
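
A minimal sketch of the round computation and divisibility check described above, assuming illustrative names (`compute_num_rounds` is hypothetical and not part of the actual PiPPy API):

```python
def compute_num_rounds(n_microbatches: int, pp_group_size: int) -> int:
    # One "round" covers up to pp_group_size microbatches; with fewer
    # microbatches than pipeline stages we still run a single round.
    num_rounds = max(1, n_microbatches // pp_group_size)
    # The flex_pp schedule requires the microbatch count to split evenly
    # across rounds.
    if n_microbatches % num_rounds != 0:
        raise ValueError(
            f"n_microbatches ({n_microbatches}) must be divisible by "
            f"num_rounds ({num_rounds})"
        )
    return num_rounds

# Example (1): 10 microbatches over 4 stages -> 2 rounds (10 % 2 == 0)
assert compute_num_rounds(10, 4) == 2
# Example (2): 3 microbatches over 4 stages -> 1 round (3 % 1 == 0)
assert compute_num_rounds(3, 4) == 1
```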

Tested using the config in (1); the resulting schedule looks like the following graph:

[image: flex_pp schedule for pp_group_size = 4, n_microbatches = 10]

vivien-chu commented 4 months ago

n00b question: how do we assign the received tensors to the corresponding model chunks?