NVIDIA / nccl

Optimized primitives for collective multi-GPU communication

A Question about network buffer #1450

Closed ZhiyiHu1999 closed 1 week ago

ZhiyiHu1999 commented 1 month ago

Hello! I have a question about the network buffer between the GPU and the proxy thread: is the buffer (4 MiB by default for the Simple protocol) shared by all channels, or does each channel have its own network buffer? I think each channel has its own; is that correct? Thanks for your reply!

sjeaugey commented 1 month ago

Yes, in general, each channel has its own buffer for collective operations (ring/tree/etc). For send/recv operations however, we have a single shared buffer.
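As a rough illustration of that distinction (a toy model, not NCCL code): the sketch below assigns one buffer per channel for collectives and a single shared buffer for send/recv. The function names and the `"collective"`/`"p2p"` labels are hypothetical. The 4 MiB figure is the Simple-protocol default mentioned in the question, which can be overridden with the `NCCL_BUFFSIZE` environment variable. The size estimate assumes exactly one buffer per channel and ignores other protocols and any per-peer detail.

```python
import os

# 4 MiB: the Simple-protocol default quoted in the question.
DEFAULT_BUFFSIZE = 4 * 1024 * 1024


def buffer_size_bytes(env=None):
    """Return the buffer size in bytes, honoring NCCL_BUFFSIZE if it is set."""
    env = os.environ if env is None else env
    return int(env.get("NCCL_BUFFSIZE", DEFAULT_BUFFSIZE))


def buffer_key(op_kind, channel_id):
    """Hypothetical model of buffer ownership.

    Collectives (ring/tree/etc.) get a buffer per channel;
    send/recv operations all map to one shared buffer.
    """
    if op_kind == "collective":
        return ("collective", channel_id)
    if op_kind == "p2p":
        return ("p2p-shared",)
    raise ValueError(f"unknown op kind: {op_kind}")


def distinct_buffers(op_kind, n_channels):
    """Count the distinct buffers used across n_channels channels."""
    return len({buffer_key(op_kind, c) for c in range(n_channels)})


def collective_buffer_bytes(n_channels, buff_size):
    # Assumption: exactly one buffer of buff_size bytes per channel.
    return distinct_buffers("collective", n_channels) * buff_size


if __name__ == "__main__":
    size = buffer_size_bytes()
    print("collective buffers over 4 channels:", distinct_buffers("collective", 4))
    print("p2p buffers over 4 channels:", distinct_buffers("p2p", 4))
    print("collective bytes over 4 channels:", collective_buffer_bytes(4, size))
```

Under these assumptions, four channels would use four collective buffers but only one send/recv buffer.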

ZhiyiHu1999 commented 1 month ago

Thanks a lot!

ZhiyiHu1999 commented 1 month ago

> Yes, in general, each channel has its own buffer for collective operations (ring/tree/etc). For send/recv operations however, we have a single shared buffer.

Hello @sjeaugey! I have a follow-up question: if a node (suppose it has only one GPU) in a Tree channel has two children and one parent, do the sends to these three peers share one network buffer, or does the send to each peer have its own buffer (three buffers in this case)? Thanks a lot!