NVIDIA / nccl

Optimized primitives for collective multi-GPU communication

Is there any benchmark comparing P2P communication in NCCL and UCX (UCP)? #1441

Open MoFHeka opened 1 week ago

MoFHeka commented 1 week ago

I am interested in scenarios such as PCIe, RDMA, and TCP/IP, among others, but I do not know what kind of test would be appropriate.

AddyLaddy commented 1 week ago

NCCL does not use UCX by default, but I'm not sure what you're asking about.

There is a suite of NCCL benchmarks at https://github.com/NVIDIA/nccl-tests
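For anyone trying to line up numbers from two different tools, here is a minimal Python sketch of the bookkeeping involved: sweeping the same message sizes for both stacks and converting each measured time into bandwidth with one formula. The helper names `message_sizes` and `bandwidth_gb_per_s` are hypothetical and belong to neither NCCL nor UCX. The timings themselves would have to come from real runs, for example the `sendrecv_perf` test in nccl-tests and `ucx_perftest` from UCX.

```python
def message_sizes(min_bytes=8, max_bytes=128 * 1024 * 1024, factor=2):
    """Return a geometric sweep of message sizes in bytes.

    The defaults (8 B to 128 MiB, doubling each step) are an assumption
    chosen for illustration, not a recommendation from either project.
    """
    sizes = []
    size = min_bytes
    while size <= max_bytes:
        sizes.append(size)
        size *= factor
    return sizes


def bandwidth_gb_per_s(num_bytes, elapsed_seconds):
    """Bytes moved divided by elapsed time, in GB/s (1 GB = 1e9 bytes)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_bytes / elapsed_seconds / 1e9
```

Using the same size sweep and the same bandwidth definition for both libraries avoids comparing results that were measured at different message sizes or reported in different units.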

MoFHeka commented 1 week ago

I mean: how does the default P2P communication efficiency of NCCL compare with that of UCX, and in which scenarios does each perform better?