bytedance / byteps

A high performance and generic framework for distributed DNN training

2 workers slower than 1 worker #360

Open qingfengmingyue opened 3 years ago

qingfengmingyue commented 3 years ago

[image] [image] 2 workers and 2 servers

jasperzhong commented 3 years ago

That's expected. With only one worker there are no gradients to exchange, so training incurs no communication overhead. With two workers, every iteration also has to pay for gradient synchronization.
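
As a rough illustration of this point, here is a back-of-the-envelope sketch. It is not BytePS's actual cost model: the function names, the no-overlap assumption, and every number below are hypothetical and chosen only to show how communication time can dominate.

```python
def iteration_time(compute_s, grad_bytes, bandwidth_bytes_per_s, num_workers):
    """Estimate seconds per training iteration under a toy cost model."""
    if num_workers == 1:
        # A single worker has nothing to synchronize.
        return compute_s
    # Assumption: each worker pushes its gradients and then pulls the
    # aggregated result, with no overlap between communication and compute.
    comm_s = 2 * grad_bytes / bandwidth_bytes_per_s
    return compute_s + comm_s


def throughput(batch_per_worker, num_workers, seconds_per_iteration):
    """Aggregate samples per second across all workers."""
    return batch_per_worker * num_workers / seconds_per_iteration


# Hypothetical setup: 100M fp32 parameters (4e8 bytes of gradients),
# a 10 Gbit/s link (1.25e9 bytes/s), and 0.1 s of compute per iteration.
COMPUTE_S = 0.1
GRAD_BYTES = 4e8
BANDWIDTH = 1.25e9

for n in (1, 2):
    t = iteration_time(COMPUTE_S, GRAD_BYTES, BANDWIDTH, n)
    print(n, "worker(s):", round(t, 3), "s/iter,",
          round(throughput(32, n, t), 1), "samples/s")
```

Under these made-up numbers the two-worker run is slower both per iteration and in aggregate throughput. A faster network, a smaller model, or overlapping communication with compute would shift the balance.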

qingfengmingyue commented 3 years ago

What kind of configuration is efficient? Can different workers have different numbers of GPUs?

jasperzhong commented 3 years ago

> What kind of configuration is efficient? Can different workers have different numbers of GPUs?

  1. Generally, the number of servers should equal the number of worker machines.
  2. No. Every worker must have the same number of GPUs.
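
The two rules above can be sketched as a small helper that builds per-process environments for a symmetric cluster. This is only a sketch: `build_envs` is a hypothetical helper, the scheduler address and port are placeholders, and the `DMLC_*` variable names are assumed to match the ones in the BytePS step-by-step tutorial, so check them against the version you run.

```python
def build_envs(gpus_per_worker, scheduler_ip, scheduler_port):
    """Return one environment dict per process (workers, servers, scheduler).

    gpus_per_worker: list with the GPU count of each worker machine.
    """
    # Rule 2: every worker must have the same number of GPUs.
    if len(set(gpus_per_worker)) != 1:
        raise ValueError("all workers must have the same number of GPUs")

    num_workers = len(gpus_per_worker)
    # Rule 1: use as many servers as there are worker machines.
    num_servers = num_workers

    # Settings shared by every process in the job.
    common = {
        "DMLC_NUM_WORKER": str(num_workers),
        "DMLC_NUM_SERVER": str(num_servers),
        "DMLC_PS_ROOT_URI": scheduler_ip,
        "DMLC_PS_ROOT_PORT": str(scheduler_port),
    }

    envs = []
    for worker_id, num_gpus in enumerate(gpus_per_worker):
        envs.append(dict(
            common,
            DMLC_ROLE="worker",
            DMLC_WORKER_ID=str(worker_id),
            NVIDIA_VISIBLE_DEVICES=",".join(str(i) for i in range(num_gpus)),
        ))
    for _ in range(num_servers):
        envs.append(dict(common, DMLC_ROLE="server"))
    envs.append(dict(common, DMLC_ROLE="scheduler"))
    return envs


# Hypothetical cluster: 2 worker machines with 8 GPUs each.
for env in build_envs([8, 8], "10.0.0.1", 1234):
    print(env["DMLC_ROLE"], env)
```

Each dict would be exported on the corresponding machine before starting the process. The helper rejects a mixed cluster such as `[8, 4]` up front instead of letting the job start with an unsupported layout.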