NVIDIA / nccl

Optimized primitives for collective multi-GPU communication

Single or double ring #1360

Open CatalinLucian opened 2 months ago

CatalinLucian commented 2 months ago

Hello,

After a few experiments, it seems that NCCL uses a double-ring topology for data transfer. Is double ring the default, or is there an option to switch to a single-ring topology? I am investigating different topology configurations and data transfer orders.

Regards, Catalin

sjeaugey commented 2 months ago

The number of rings NCCL uses depends on your hardware topology and on how many rings it needs to reach peak bandwidth. Each ring is run by a GPU SM and can use a different path within the node to maximize hardware utilization.
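
For experiments like the one described in the question, one possible way to pin NCCL to a single ring is to cap the channel count through environment variables before launching the job. The sketch below is not taken from this thread. It assumes the documented NCCL variables `NCCL_ALGO`, `NCCL_MIN_NCHANNELS`, `NCCL_MAX_NCHANNELS`, `NCCL_DEBUG` and `NCCL_DEBUG_SUBSYS` behave as described in the NCCL user guide for your version. The helper names `single_ring_env` and `run_with_single_ring` are hypothetical, as is the benchmark command in the usage note.

```python
import os
import subprocess


def single_ring_env(base_env=None):
    """Return a copy of the environment that asks NCCL for one ring channel.

    Assumption: the installed NCCL version honors these variables; older
    releases used differently named settings for the ring count.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NCCL_ALGO"] = "Ring"                 # restrict algorithm choice to ring
    env["NCCL_MIN_NCHANNELS"] = "1"           # lower bound on channels (rings)
    env["NCCL_MAX_NCHANNELS"] = "1"           # upper bound on channels (rings)
    env["NCCL_DEBUG"] = "INFO"                # print initialization details
    env["NCCL_DEBUG_SUBSYS"] = "INIT,GRAPH"   # include topology/graph search output
    return env


def run_with_single_ring(cmd):
    """Run a command (list of strings) with the single-ring environment applied."""
    return subprocess.call(cmd, env=single_ring_env())
```

A call such as `run_with_single_ring(["./all_reduce_perf", "-g", "4"])` would launch a hypothetical benchmark binary with these settings. The `NCCL_DEBUG=INFO` output should then show how many channels were actually created, which is worth checking because NCCL may still adjust the count for a given topology. Expect lower bandwidth than the default, since the extra rings exist to reach peak bandwidth.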