Hello! In my parallelism strategy, with for example 4 nodes and 2 GPUs per node, I would like to create a comm_0 spanning all 8 GPUs, a comm_1 for the 4 GPUs on node_0 and node_1, and a comm_2 for the other 4 GPUs on node_2 and node_3.
If, by design, no collectives are ever concurrent on communicators that share a GPU (here, collectives on comm_0 and comm_1 are never concurrent, but collectives on comm_1 and comm_2 may run concurrently since those communicators cover disjoint GPUs), is this a safe use of NCCL?
Also, does each communicator have its own Ring/Tree channels, and does each need its own rank identifier from 0 to (nRanks_in_the_communicator - 1)? Thanks a lot!
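To make the intended layout concrete, here is a small Python sketch (illustrative only, not NCCL API code) of the rank mapping described above. It assumes each sub-communicator re-numbers its members from 0 to nRanks-1, which is how NCCL communicators identify ranks; the `color`/`sub_rank` pair it computes is the kind of input one would pass to `ncclCommSplit` (available in NCCL 2.18+) as the `color` and `key` arguments when deriving comm_1 and comm_2 from comm_0. The names `NODES`, `GPUS_PER_NODE`, and `sub_comm` are made up for this sketch.

```python
# 4 nodes x 2 GPUs per node = 8 global ranks (ranks 0..7 in comm_0).
NODES = 4
GPUS_PER_NODE = 2
WORLD = NODES * GPUS_PER_NODE

def sub_comm(global_rank):
    """Return (color, sub_rank) for the split described in the post:
    color 0 -> comm_1 (GPUs on node_0 and node_1),
    color 1 -> comm_2 (GPUs on node_2 and node_3).
    sub_rank runs 0..3 within each sub-communicator."""
    node = global_rank // GPUS_PER_NODE
    color = 0 if node < 2 else 1
    sub_rank = global_rank - color * (WORLD // 2)
    return color, sub_rank

for r in range(WORLD):
    print(f"global rank {r} -> comm_{sub_comm(r)[0] + 1}, sub-rank {sub_comm(r)[1]}")
```

Note that comm_1 and comm_2 have disjoint GPU sets, so (under the no-sharing assumption in the question) their collectives never contend for the same device.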