NCCL TL #106 (openucx/ucc)

Sergei-Lebedev commented 3 years ago

In PR #84 we are adding support for NCCL TL. If UCC is built with NCCL support, CLs may select TL NCCL for CUDA collectives, i.e. when both the source and destination buffers have memory type CUDA. However, NCCL has some known limitations, for example around launching multiple collectives concurrently on different streams. Users are therefore encouraged to follow the NCCL guidelines to avoid potential deadlocks. From the UCC perspective, this means that if multiple teams are created and NCCL TL is used, the user should not post CUDA collectives to different teams at the same time.
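One way to follow this rule is to complete the collective on one team before posting the next one on another team. The sketch below illustrates only that ordering and does not call UCC itself: `fake_op_t`, `fake_post`, `fake_test`, `run_serialized` and `demo_max_in_flight` are hypothetical names made up for this example. In a real application the post and test steps would presumably map to `ucc_collective_post` and `ucc_collective_test` on requests created for each team.

```c
#include <assert.h>
#include <stddef.h>

/* Completion states, loosely modeled on UCC status codes (assumption). */
enum { OP_OK = 0, OP_INPROGRESS = 1 };

/* Hypothetical stand-in for one collective request on one team. */
typedef struct {
    int  polls_left; /* test calls remaining before the fake op completes */
    int *in_flight;  /* shared count of ops posted but not yet complete   */
    int *max_seen;   /* highest value in_flight has ever reached          */
} fake_op_t;

/* Stand-in for posting a collective: marks the op as in flight. */
static int fake_post(fake_op_t *op)
{
    ++*op->in_flight;
    if (*op->in_flight > *op->max_seen) {
        *op->max_seen = *op->in_flight;
    }
    return OP_OK;
}

/* Stand-in for testing a collective: completes after polls_left calls. */
static int fake_test(fake_op_t *op)
{
    if (op->polls_left > 0) {
        op->polls_left--;
        return OP_INPROGRESS;
    }
    --*op->in_flight;
    return OP_OK;
}

/* Post each op and wait for it to finish before touching the next team,
 * so collectives on different teams are never in flight together. */
static int run_serialized(fake_op_t *ops, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int st = fake_post(&ops[i]);
        if (st != OP_OK) {
            return st;
        }
        do {
            st = fake_test(&ops[i]);
        } while (st == OP_INPROGRESS);
        if (st != OP_OK) {
            return st;
        }
    }
    return OP_OK;
}

/* Runs three fake ops (one per imaginary team) and reports the maximum
 * number that were in flight at once; returns -1 on failure. */
int demo_max_in_flight(void)
{
    int in_flight = 0;
    int max_seen  = 0;
    fake_op_t ops[3] = {
        { 2, &in_flight, &max_seen },
        { 0, &in_flight, &max_seen },
        { 3, &in_flight, &max_seen },
    };

    if (run_serialized(ops, 3) != OP_OK || in_flight != 0) {
        return -1;
    }
    return max_seen;
}
```

Calling `demo_max_in_flight()` reports how many of the fake collectives overlapped. With the serialized loop the count never exceeds one, which is the property the guideline above asks for across teams.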

zfy3000163 commented 1 month ago

Does it support nccl, IB and TCP networks at the same time? Thanks very much!

Sergei-Lebedev commented 1 month ago

> Does it support nccl, IB and TCP networks at the same time? Thanks very much!

Yes, all of these transports are supported.
