NVIDIA / cuda-samples

Samples for CUDA Developers which demonstrates features in CUDA Toolkit

P2P is supported between GPUs that belong to different NUMA nodes #284

Open themoonstone opened 4 months ago

themoonstone commented 4 months ago

I ran simpleP2P on my server (two NVIDIA L4 GPUs, with the following topology), where GPU0 and GPU1 belong to different NUMA nodes.

        GPU0    GPU1    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      SYS     0-7,32-39       0               N/A
GPU1    SYS      X      16-23,48-55     2               N/A

I got the following output, which reports that peer access is supported in both directions. Why is that? How is P2P (peer-to-peer) access supported between GPUs that sit under different PCIe root complexes?

./simpleP2P 
[./simpleP2P] - Starting...
Checking for multiple GPUs...
CUDA-capable device count: 2

Checking GPU(s) for support of peer to peer memory access...
> Peer access from NVIDIA L4 (GPU0) -> NVIDIA L4 (GPU1) : Yes
> Peer access from NVIDIA L4 (GPU1) -> NVIDIA L4 (GPU0) : Yes
Enabling peer access between GPU0 and GPU1...
Allocating buffers (64MB on GPU0, GPU1 and CPU Host)...
Creating event handles...
cudaMemcpyPeer / cudaMemcpy between GPU0 and GPU1: 19.87GB/s
Preparing host buffer and memcpy to GPU0...
Run kernel on GPU1, taking source data from GPU0 and writing to GPU1...
Run kernel on GPU0, taking source data from GPU1 and writing to GPU0...
Copy data back to host from GPU0 and verify results...
Disabling peer access...
Shutting down...
Test passed