NVIDIA / nccl

Optimized primitives for collective multi-GPU communication

NCCL with Python code (question) #253

Closed abidmalikwaterloo closed 5 years ago

abidmalikwaterloo commented 5 years ago

Can we use NCCL with Python code? Thanks

keisukefukuda commented 5 years ago

The answer is Yes and No.

If you use a deep learning framework, the answer is Yes: most major frameworks support NCCL, for example TensorFlow + Horovod or PyTorch + Horovod.

NCCL is a C library and Python can call C functions, so the answer is also Yes in that sense. However, NCCL targets NVIDIA GPUs, so you need to allocate GPU device memory and pass the device pointers to NCCL. In a bare Python program this is not easy, so in that sense the answer is No.
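To illustrate the "Python can call C functions" part, here is a minimal sketch that loads the NCCL shared library with `ctypes` and calls `ncclGetVersion`, one of the few NCCL functions that needs no GPU memory. The library name `libnccl.so.2` is an assumption about a typical Linux install, and the helper names `load_nccl` and `nccl_version` are made up for this example.

```python
import ctypes
import ctypes.util


def load_nccl():
    """Return a ctypes handle to the NCCL library, or None if it cannot be loaded."""
    # Try whatever the system linker resolves first, then the usual Linux
    # soname (an assumption; adjust for your installation).
    candidates = [ctypes.util.find_library("nccl"), "libnccl.so.2"]
    for name in candidates:
        if not name:
            continue
        try:
            return ctypes.CDLL(name)
        except OSError:
            continue
    return None


def nccl_version(lib):
    """Call ncclGetVersion(int *version) and return the integer version code."""
    version = ctypes.c_int()
    lib.ncclGetVersion.argtypes = [ctypes.POINTER(ctypes.c_int)]
    lib.ncclGetVersion.restype = ctypes.c_int  # ncclResult_t; 0 means ncclSuccess
    status = lib.ncclGetVersion(ctypes.byref(version))
    if status != 0:
        raise RuntimeError(f"ncclGetVersion failed with status {status}")
    return version.value


if __name__ == "__main__":
    lib = load_nccl()
    if lib is None:
        print("NCCL shared library not found")
    else:
        print("NCCL version code:", nccl_version(lib))
```

Going beyond this with plain `ctypes` is where it gets hard: the collectives such as `ncclAllReduce` expect device pointers and a CUDA stream, which bare Python has no convenient way to create.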

Many deep learning frameworks rely on support libraries, written in C/C++, that bridge Python and NCCL. Horovod is probably the most widely used one. If you are NOT using a deep learning framework and want to use only NCCL, I think CuPy (https://cupy.chainer.org/) is an option. CuPy is a Python library for CUDA and supports NCCL. Here is an example of using NCCL through CuPy: https://github.com/chainer/chainer/blob/master/chainermn/communicators/pure_nccl_communicator.py#L166
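For a rough idea of what the CuPy route looks like, here is a minimal single-process, single-GPU sketch using `cupy.cuda.nccl`. It assumes a CuPy build with NCCL support and one visible NVIDIA GPU; the function name `allreduce_on_one_gpu` is made up for this example, and this is not the code from the linked ChainerMN communicator.

```python
def allreduce_on_one_gpu(values):
    """Sum-all-reduce `values` over a one-rank NCCL communicator via CuPy."""
    # Imported lazily so this module still loads on machines without CuPy/CUDA.
    import cupy
    from cupy.cuda import nccl

    # CuPy allocates the device memory that NCCL needs.
    send = cupy.asarray(values, dtype=cupy.float32)
    recv = cupy.empty_like(send)

    # One-rank communicator: (number of ranks, unique id, this rank).
    uid = nccl.get_unique_id()
    comm = nccl.NcclCommunicator(1, uid, 0)

    # NCCL takes raw device pointers, an element count, and a stream handle.
    comm.allReduce(send.data.ptr, recv.data.ptr, send.size,
                   nccl.NCCL_FLOAT32, nccl.NCCL_SUM,
                   cupy.cuda.Stream.null.ptr)
    cupy.cuda.Stream.null.synchronize()
    return cupy.asnumpy(recv)


if __name__ == "__main__":
    print(allreduce_on_one_gpu([1.0, 2.0, 3.0]))
```

With a single rank the sum is just the input, so this only shows the call pattern. For real multi-GPU or multi-node use, every rank has to receive the same unique id (typically broadcast over MPI or a similar channel) and create its communicator with its own rank number.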

abidmalikwaterloo commented 5 years ago

@keisukefukuda Thank you for the detailed guidance. It is really helpful.