liaopeiyuan opened this issue 3 years ago (status: Open)
If I write my own multi-GPU model or use `torch.distributed.pipeline.sync.Pipe`, would multi-node training still work with BytePS?
We are working on supporting model parallelism. For now, you can still use BytePS to optimize the `allreduce` primitive in your code.
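For reference, here is a minimal sketch of how BytePS typically takes over the gradient allreduce (push-pull) in a PyTorch training loop, using the Horovod-style `byteps.torch` API (`init`, `DistributedOptimizer`, `broadcast_parameters`). It assumes the standard one-process-per-GPU launch via `bpslaunch`; the single-GPU model here is just a placeholder, and how you partition the model inside each worker (e.g. with `Pipe`) is up to you and not covered by this sketch.

```python
# Sketch only: let BytePS handle the cross-worker gradient allreduce (push-pull).
# Assumes the usual one-process-per-GPU launch; the model is a plain placeholder.
import torch
import torch.nn as nn
import torch.nn.functional as F
import byteps.torch as bps

bps.init()
torch.cuda.set_device(bps.local_rank())

model = nn.Linear(1024, 10).cuda()
opt = torch.optim.SGD(model.parameters(), lr=0.01 * bps.size())

# Wrap the optimizer so gradients are push-pulled (allreduced) across workers
# before each optimizer step, Horovod-style.
opt = bps.DistributedOptimizer(opt, named_parameters=model.named_parameters())

# Start every worker from the same weights and optimizer state.
bps.broadcast_parameters(model.state_dict(), root_rank=0)
bps.broadcast_optimizer_state(opt, root_rank=0)

for _ in range(10):
    x = torch.randn(32, 1024).cuda()
    y = torch.randint(0, 10, (32,)).cuda()
    opt.zero_grad()
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    opt.step()
```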