Closed kashif closed 6 years ago
It looks making sense. When CuPy was split from Chainer, CUDA related things including cuDNN support were included in CuPy side even though they were not for ndarray that CuPy is essentially responsible to.
Yes so the only place I found where cupy's nccl bindings are being used is in chainer's mult-gpu trainer example, but again correct me if I am wrong, Chainermn is the way to do that, so perhaps it might make sense to think about what belongs where moving forward.
I haven't used the nccl
bindings yet, but I do find it slightly nice to have all of the CUDA-related bindings here in one place. One of the things I like with cupy is that I can bypass or unbox the ndarray
and make direct calls to the CUDA or cudnn APIs, even doing this via Cython if I want to release the GIL.
Of course if the bindings complicate the build or cause other problems, then yes that's a good argument for removing them. But for simple tidiness I would slightly prefer they stayed.
ChainerMN now uses NCCL bindings in CuPy. https://github.com/chainer/chainermn/pull/181
Hello!
Would it make sense to remove the nccl bindings from Cupy? Its not being used in Cupy itself as far as I can see, and neither is chainermn using these bindings, there you have more or less the same bindings replicated.
Also nccl1 and nccl2 do not have any binaries for OSX so compiling these bindings also fails on this platform. One could manually install nccl1 from the source but as far as I can see, nccl2 is closed source now.