NVIDIA / gdrcopy

A fast GPU memory copy library based on NVIDIA GPUDirect RDMA technology
MIT License

How should I implement device-to-device memory copying between different hosts? For example, copying the contents from host B's GPU memory to host A's CPU memory. #300

Closed: JOjoker-world closed this issue 4 months ago

pakmarkthub commented 4 months ago

Hi @JOjoker-world,

This is internode communication. You can use a communication library such as MPI, NCCL, or NVSHMEM, depending on the programming model your application uses.

JOjoker-world commented 4 months ago

Thank you very much for your reply. Do you mean that gdrcopy can achieve direct copying between GPU and CPU across different nodes (e.g., A.GPU -> B.CPU) by using the appropriate communication library?

pakmarkthub commented 4 months ago

No. GDRCopy is an intranode communication library between GPU and CPU. It cannot do internode communication. MPI, NCCL, and NVSHMEM, on the other hand, are designed to support internode communication, so you will want to use one of them.
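
For illustration, here is a minimal sketch of the MPI route for the case asked about (host B's GPU memory to host A's CPU memory). It assumes mpi4py, NumPy, and CuPy are installed and that rank 0 runs on host A and rank 1 on host B. The helper names `send_gpu_buffer` and `recv_into_host` are hypothetical, not part of any library. The data is staged through host memory on the sender, so it does not rely on a CUDA-aware MPI build.

```python
import numpy as np


def send_gpu_buffer(comm, dev_buf, dest, to_host, tag=0):
    """Copy a GPU array into host memory, then send it over MPI.

    `to_host` is the device-to-host copy function (cupy.asnumpy for CuPy).
    """
    host_buf = np.ascontiguousarray(to_host(dev_buf))
    comm.Send(host_buf, dest=dest, tag=tag)


def recv_into_host(comm, nelems, dtype, source, tag=0):
    """Receive `nelems` elements of `dtype` into a new host (CPU) array."""
    host_buf = np.empty(nelems, dtype=dtype)
    comm.Recv(host_buf, source=source, tag=tag)
    return host_buf


def main():
    # Imported here so the helpers above can be reused without MPI or a GPU.
    from mpi4py import MPI
    import cupy as cp

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    n = 1 << 20  # element count, agreed on by both ranks (assumption)

    if rank == 1:
        # Host B: the data lives in GPU memory.
        dev = cp.arange(n, dtype=cp.float32)
        send_gpu_buffer(comm, dev, dest=0, to_host=cp.asnumpy)
    elif rank == 0:
        # Host A: the data lands in ordinary CPU memory.
        host = recv_into_host(comm, n, np.float32, source=1)
        print("received", host.size, "elements")


if __name__ == "__main__":
    main()
```

This would be launched with two ranks placed on different hosts, for example through `mpirun -n 2` with a host file; the exact launcher options depend on the MPI implementation. With a CUDA-aware MPI build, the staging copy on the sender may be unnecessary, since the GPU buffer can then be passed to MPI directly.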

JOjoker-world commented 4 months ago

Alright, I understand. Thank you for your reply.