Closed YZP17121579 closed 6 months ago
That's because GeForce cards don't support GPUDirect P2P (direct peer-to-peer transfers between GPUs over PCIe). The traffic therefore cannot stay local to the PCIe switch and has to go through the CPU, which doubles the load on the PCIe link to the CPU compared with the case where the two GPUs are on different sockets.
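To make the 2x figure concrete, here is a rough sketch that counts how many times a copy staged through the CPU crosses each link. The topology and the names in it (`SW0`, `SW1`, `CPU0`, `CPU1`, which GPU hangs off which switch) are assumptions for illustration, not taken from the actual machine, and the model only counts traversals; it ignores link direction and any other bottleneck.

```python
from collections import Counter

# Hypothetical topology (illustrative names only): child -> upstream parent.
# GPU2 and GPU3 share one PCIe switch under CPU0; GPU0 is on the other socket.
PARENT = {
    "GPU0": "SW1", "SW1": "CPU1",
    "GPU2": "SW0", "GPU3": "SW0", "SW0": "CPU0",
}

def hops_to_cpu(node):
    """Return the links crossed from a device up to its CPU, and that CPU."""
    hops = []
    while node in PARENT:
        hops.append(frozenset((node, PARENT[node])))
        node = PARENT[node]
    return hops, node

def staged_link_load(src, dst):
    """Count link traversals for a copy that must go through the CPU (no P2P)."""
    up, src_cpu = hops_to_cpu(src)
    down, dst_cpu = hops_to_cpu(dst)
    links = up + down
    if src_cpu != dst_cpu:
        # Different sockets: the copy also crosses the inter-socket link once.
        links.append(frozenset((src_cpu, dst_cpu)))
    return Counter(links)

UPLINK = frozenset(("SW0", "CPU0"))  # the switch-to-CPU link shared by GPU2/GPU3
same_switch = staged_link_load("GPU2", "GPU3")
cross_socket = staged_link_load("GPU0", "GPU3")
print(same_switch[UPLINK], cross_socket[UPLINK])  # prints: 2 1
```

In this toy model the same-switch copy crosses the SW0 to CPU0 link twice (once up, once back down), while the cross-socket copy crosses it only once, which is the 2x difference described above.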
As the graph above shows, the topology between GPU0 and GPU3 is SYS, while between GPU2 and GPU3 it is PIX. Why is the bandwidth between GPU2 and GPU3 so much lower than between GPU0 and GPU3?