bytedance / byteps

A high performance and generic framework for distributed DNN training

Is model parallelism supported for PyTorch? #382

Open liaopeiyuan opened 3 years ago

liaopeiyuan commented 3 years ago

If I write my own multi-GPU model or use torch.distributed.pipeline.sync.Pipe, would multi-node training still work with BytePS?
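For concreteness, here is a minimal sketch of the second setup the question describes: an nn.Sequential split across two local GPUs and wrapped in torch.distributed.pipeline.sync.Pipe (available in PyTorch 1.8+). The device placement, layer sizes, and chunk count are illustrative assumptions, not part of the question.

```python
# Illustrative pipeline-parallel setup with Pipe (assumes two local GPUs).
import os
import torch
import torch.nn as nn
from torch.distributed import rpc
from torch.distributed.pipeline.sync import Pipe

# Pipe requires the RPC framework to be initialized, even in a single process.
os.environ.setdefault("MASTER_ADDR", "localhost")
os.environ.setdefault("MASTER_PORT", "29500")
rpc.init_rpc("worker", rank=0, world_size=1)

# Stage 1 on GPU 0, stage 2 on GPU 1; Pipe infers devices from the parameters.
stage1 = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU()).to("cuda:0")
stage2 = nn.Sequential(nn.Linear(4096, 10)).to("cuda:1")
model = Pipe(nn.Sequential(stage1, stage2), chunks=8)

x = torch.randn(64, 1024, device="cuda:0")
out = model(x).local_value()  # forward returns an RRef; fetch the local tensor
```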

ymjiang commented 3 years ago

We are working on supporting model parallelism. For now, you can still use BytePS to optimize the allreduce primitive in your code.
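To make the suggested workaround concrete, below is a hedged sketch of BytePS's Horovod-style PyTorch integration: initialize BytePS, broadcast initial state from rank 0, and wrap your optimizer in bps.DistributedOptimizer so gradients are synchronized via push_pull (BytePS's allreduce) during step(). The stand-in Linear model and learning rate are placeholders for your own code.

```python
# Sketch of using BytePS's allreduce-style primitives in PyTorch.
# The Linear model and hyperparameters are illustrative assumptions.
import torch
import byteps.torch as bps

bps.init()
torch.cuda.set_device(bps.local_rank())

model = torch.nn.Linear(1024, 10).cuda()  # stand-in for your own model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * bps.size())

# Broadcast initial parameters and optimizer state from rank 0
# so all workers start from the same point.
bps.broadcast_parameters(model.state_dict(), root_rank=0)
bps.broadcast_optimizer_state(optimizer, root_rank=0)

# DistributedOptimizer averages gradients across workers via push_pull
# (BytePS's allreduce) when optimizer.step() is called.
optimizer = bps.DistributedOptimizer(
    optimizer, named_parameters=model.named_parameters())

# For finer control, individual tensors can also be reduced directly,
# e.g. averaged = bps.push_pull(tensor, average=True, name="my_tensor")
```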