HPDL-Group / Merak

Apache License 2.0

Understanding Data Propagation and Communication in Model Parallelism #9

Closed Hongjie1Chu closed 9 months ago

Hongjie1Chu commented 9 months ago

In the context of model parallelism, each layer's input is the output of the previous layer. Could someone explain how that output is passed from one layer to the next during training? Specifically, how do the processes in a model-parallel group communicate once each finishes its part of the computation, and which interface or function is responsible for transferring a layer's results to the subsequent layer?
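To make the question concrete, here is my current mental model as a single-process sketch. Plain Python queues stand in for the point-to-point channels; in a real pipeline-parallel run I assume something like `torch.distributed.send`/`recv` plays this role between ranks, but I am not sure which interface Merak actually uses:

```python
# Single-process sketch of pipeline-parallel data flow.
# Queues stand in for the p2p channels between stage processes;
# a real framework would use cross-process sends/recvs instead.
import queue

NUM_STAGES = 3
# channels[s] is the inbound channel of stage s
channels = {s: queue.Queue() for s in range(1, NUM_STAGES)}

def stage_forward(stage, x):
    # each "layer" just adds its stage id, standing in for a real forward pass
    return x + stage

def run_pipeline(x0):
    # stage 0 computes and "sends" its activation to stage 1
    channels[1].put(stage_forward(0, x0))
    out = None
    for s in range(1, NUM_STAGES):
        act = channels[s].get()        # "recv" from the previous stage
        out = stage_forward(s, act)
        if s + 1 < NUM_STAGES:
            channels[s + 1].put(out)   # "send" to the next stage
    return out

print(run_pipeline(10))  # 10 + 0 + 1 + 2 = 13
```

Is this roughly the pattern, with each stage blocking on a receive from its predecessor before running its forward pass?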

Referring to the model parallelism diagram below: how do processes 3 and 4 obtain the computation results from processes 1 and 2, respectively? After 3 and 4 complete their individual computations, how do they communicate with each other? And after that communication, which interface passes the results on to processes 5 and 6?
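For the communication between 3 and 4, my guess is a Megatron-style tensor-parallel step: each rank holds a shard of the weight, computes a partial result, and an all-reduce (sum) combines them before the result moves to the next stage. A simulated sketch of that guess (the sharding layout and the `all_reduce` role are my assumptions, not something I found in the code):

```python
# Sketch of a row-parallel linear layer split across 2 "ranks",
# simulated in one process. Each rank multiplies its weight shard
# by its input shard; summing the partials plays the role of
# an all_reduce(SUM) across the tensor-parallel group.
def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

# full 2x4 weight and input
W = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
x = [1, 1, 1, 1]

# split along the input dimension: rank 0 gets columns 0-1, rank 1 gets 2-3
W_shards = ([[1, 2], [5, 6]], [[3, 4], [7, 8]])
x_shards = ([1, 1], [1, 1])

partials = [matvec(Ws, xs) for Ws, xs in zip(W_shards, x_shards)]
y = [sum(vals) for vals in zip(*partials)]  # "all_reduce" over ranks

print(y)              # [10, 26]
print(matvec(W, x))   # [10, 26] -- matches the unsharded computation
```

If that is right, then after the all-reduce both ranks hold the full activation, and each presumably sends it point-to-point to its counterpart in the next stage (5 and 6). Is that how Merak does it?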

*(model parallelism diagram: `v2-708c01105de92567824bd9d3456b9459_720w.png` — image upload did not complete)*

I appreciate any clarification or references to relevant documentation that could help me understand these processes better.