NVIDIA / nccl

Optimized primitives for collective multi-GPU communication

How to enable NVLink P2P across containers on the same node? #945

Open autumn0207 opened 1 year ago

autumn0207 commented 1 year ago

Has anyone run into the same problem? Only SHM works across containers on the same node.

KaimingOuyang commented 1 year ago

I don't think you can do that. Since each container is a separate "node", the CUDA driver cannot detect any opportunity for P2P.
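
To confirm which transport NCCL actually selected between two ranks, one option is to run the job with `NCCL_DEBUG=INFO` and inspect the channel lines in the log. The sketch below is only an illustration: the helper names `diagnostic_env` and `count_transports` are hypothetical, the sample log lines are hand-written rather than real output, and the assumed `via P2P/...`, `via SHM/...`, `via NET/...` suffixes may differ between NCCL versions.

```python
import re
from collections import Counter

# Assumption: NCCL INFO channel lines end with "via <TRANSPORT>/...",
# for example "via P2P/IPC" or "via SHM/direct/direct".
# The exact wording may vary between NCCL versions.
_VIA = re.compile(r"\bvia\s+(P2P|SHM|NET)\b")


def diagnostic_env(base=None):
    """Return a copy of `base` with NCCL logging enabled (hypothetical helper)."""
    env = dict(base or {})
    env["NCCL_DEBUG"] = "INFO"                 # print INFO-level NCCL messages
    env["NCCL_DEBUG_SUBSYS"] = "INIT,P2P,SHM"  # limit output to these subsystems
    return env


def count_transports(log_text):
    """Count how many channel lines mention each transport (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = _VIA.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hand-written sample lines for illustration only, not captured from a real run.
SAMPLE_LOG = """\
host:1:1 [0] NCCL INFO Channel 00 : 0[1000] -> 1[2000] via SHM/direct/direct
host:1:1 [0] NCCL INFO Channel 01 : 0[1000] -> 1[2000] via SHM/direct/direct
"""

if __name__ == "__main__":
    print(dict(count_transports(SAMPLE_LOG)))
```

If every channel line reports SHM and none reports P2P, that would be consistent with the driver not exposing peer access between the two containers.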

autumn0207 commented 1 year ago

@KaimingOuyang Maybe we could intercept CUDA API calls inside the container and return results obtained from the host?

sjeaugey commented 1 year ago

We're not aware of any such possibility.