bytedance / byteps

A high performance and generic framework for distributed DNN training

How to run communication scheduling with BytePS #348

Open Rivendile opened 3 years ago

Rivendile commented 3 years ago

Hello, I would like to run communication scheduling with BytePS. What parameters should I set? Is setting BYTEPS_SERVER_ENABLE_SCHEDULE=1, BYTEPS_SCHEDULING_CREDIT and BYTEPS_PARTITION_BYTES enough? What's the difference between server schedule and bytescheduler? Any help would be appreciated. Thanks a lot.

ymjiang commented 3 years ago

The three parameters are enough. BYTEPS_SERVER_ENABLE_SCHEDULE applies to the server process, though it is already enabled by default. The other two are similar to those used in ByteScheduler.

What's the difference between server schedule and bytescheduler?

ByteScheduler does not support scheduling at the server side.
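As a minimal sketch of the configuration discussed above, the three variables could be set from Python before the BytePS processes start. The helper name scheduling_env is hypothetical, and the partition size below is only an illustrative value; the thread does not recommend any specific numbers. Per the reply above, BYTEPS_SERVER_ENABLE_SCHEDULE needs to be visible to the server process.

```python
import os


def scheduling_env(credit, partition_bytes, server_schedule=True):
    """Build the three scheduling-related variables as strings.

    Hypothetical helper for illustration; not part of BytePS.
    """
    return {
        # Read by the server process, according to the thread.
        "BYTEPS_SERVER_ENABLE_SCHEDULE": "1" if server_schedule else "0",
        # Scheduling credit.
        "BYTEPS_SCHEDULING_CREDIT": str(credit),
        # Size used when partitioning large tensors.
        "BYTEPS_PARTITION_BYTES": str(partition_bytes),
    }


# credit=4 is a value tried later in this thread; 4096000 is an assumed,
# illustrative partition size.
env = scheduling_env(credit=4, partition_bytes=4096000)

# Export into the current environment so that processes launched from here
# inherit the settings.
os.environ.update(env)
```

Environment variable names are case-sensitive on most systems, so the names should be set exactly as spelled here.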

Rivendile commented 3 years ago

Thanks for your timely reply. Does byteps have scheduling at the worker side by default? Is there any parameter to control this feature?

ymjiang commented 3 years ago

Does byteps have scheduling at the worker side by default? Is there any parameter to control this feature?

Yes, BYTEPS_SCHEDULING_CREDIT controls the scheduling at the worker side.

Rivendile commented 3 years ago

Thanks a lot. Is the priority used for worker-side scheduling the index of the layers? What is the priority used for server-side scheduling?

ymjiang commented 3 years ago

For both sides, the priority is determined by the tensor index.

If you are interested in how it works, see: https://github.com/bytedance/byteps/blob/master/byteps/server/server.cc#L473

Rivendile commented 3 years ago

Hi @ymjiang, I tried setting these parameters to control the scheduling behavior, and this works when using BytePS with MXNet. However, when I set byteps_server_enable_schedule=1 and byteps_scheduling_credit=4, the timeline looks like this: [image] It seems that there is no scheduling at all. Any suggestions?

ymjiang commented 3 years ago

It seems that there is no scheduling at all. Any suggestions?

Actually, I am not sure how you reached this conclusion from the figure. Enabling scheduling only means that earlier layers are preferentially selected for communication; this order is not guaranteed.

Rivendile commented 3 years ago

I'm a little confused. First, in MXNet, when I use these parameters, the earlier layers are selected for communication first, and any later-layer tensors that have not yet been sent when an earlier-layer tensor arrives are delayed. [image] As shown above, in MXNet the tensor of gradient 28 waits for the earlier layers. But in the figure at https://github.com/bytedance/byteps/issues/348#issuecomment-755952360, the tensors of features 28, 26, 24, and 17 show a similar pattern: they all have pull operations that complete either early or late.

Second, don't the credit size and the priority queue used in https://github.com/bytedance/byteps/blob/249006c9105d7b4fd09962eb133c3e76de1c8656/byteps/common/scheduled_queue.cc guarantee the preference, at least within the range of the credit size? After all, the GetTask function returns the ready tensor with the highest priority.

Rivendile commented 3 years ago

I think there is a problem with the for loop in the GetTask function in https://github.com/bytedance/byteps/blob/249006c9105d7b4fd09962eb133c3e76de1c8656/byteps/common/scheduled_queue.cc. For example, when a large tensor with high priority is not chosen because of credit limitation, some small tensors with lower priority may be chosen. Thus the priority is not guaranteed. So why doesn't BytePS use a true priority queue, like the one in ByteScheduler (https://github.com/bytedance/byteps/blob/33fe89f5a6a691ec562ad2b0167f0192fd8ced7d/bytescheduler/bytescheduler/common/bytecore.py#L213), to avoid this problem?

ymjiang commented 3 years ago

For example, when a large tensor with high priority is not chosen because of credit limitation, some small tensors with lower priority may be chosen. Thus the priority is not guaranteed.

We partition large tensors into equal-sized pieces to avoid this. (Small tensors do not matter much, according to our tests.)
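The exchange above can be illustrated with a toy model. This is a simplified sketch, not the actual code in scheduled_queue.cc: the Task class, the get_task and partition functions, the convention that a smaller priority number means more urgent, and the choice to measure credit in the same units as tensor size are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    priority: int  # assumed convention: smaller number = more urgent
    size: int      # size in arbitrary units
    name: str


def get_task(queue, credit):
    """Return the most urgent task that fits in the remaining credit.

    Tasks that do not fit are skipped, which is where a less urgent
    but smaller task can overtake a large urgent one.
    """
    for task in sorted(queue, key=lambda t: t.priority):
        if task.size <= credit:
            queue.remove(task)
            return task
    return None


def partition(task, part_size):
    """Split a task into chunks of at most part_size, keeping its priority."""
    chunks = []
    offset = 0
    index = 0
    while offset < task.size:
        size = min(part_size, task.size - offset)
        chunks.append(Task(task.priority, size, f"{task.name}[{index}]"))
        offset += size
        index += 1
    return chunks


# Without partitioning: the urgent tensor (size 8) exceeds the credit (4),
# so the less urgent small tensor is picked instead.
unpartitioned = [Task(0, 8, "big"), Task(1, 2, "small")]
picked_without = get_task(unpartitioned, credit=4)

# With partitioning into pieces of size 4: the first piece of the urgent
# tensor fits in the credit and is picked first.
partitioned = partition(Task(0, 8, "big"), 4) + [Task(1, 2, "small")]
picked_with = get_task(partitioned, credit=4)
```

In this toy setup, picked_without is the small low-priority tensor, while picked_with is the first chunk of the large high-priority tensor, which matches the reasoning that equal-sized partitions keep a large tensor from being blocked by the credit limit.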

Rivendile commented 3 years ago

Thanks for your explanation! Could you please tell me what else prevents the priority from being guaranteed?

lucasleesw commented 3 years ago

Does byteps have scheduling at the worker side by default? Is there any parameter to control this feature?

Yes, BYTEPS_SCHEDULING_CREDIT controls the scheduling at the worker side.

Hi, since scheduling at the worker side is the default behavior, could you explain the difference between import byteps.torch as bps and import byteps.torch.cross_barrier as bps?