Open ZhiyiHu1999 opened 15 hours ago
I believe GeForce cards are not P2P-capable. That may not be a big deal if you only have 2 GPUs per node and they are connected directly to the CPU rather than through a PCI switch. In that case, going through host memory can give better performance than P2P.
Hello! I am doing all-to-all communication using ncclSend() and ncclRecv() between 4 GPUs on two nodes, with 2 GPUs per node. However, it seems that GPUs on the same node cannot use P2P communication with each other, and here is the debug info. Could you help me understand why this is the case? Thanks a lot!