Open arnavdixit opened 1 year ago
Set these environment variables (see https://github.com/NVIDIA/FasterTransformer/blob/main/cmake/Modules/FindNCCL.cmake#L93).
In my case, they were:
$ export NCCL_INCLUDE_DIR=/usr/local/nccl2/include
$ export NCCL_LIB_DIR=/usr/local/nccl2/lib
$ export NCCL_VERSION=2
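The paths above are specific to my machine, so it may be worth confirming that your own NCCL prefix actually contains the header and the shared library before exporting anything. Here is a rough sketch of such a check. `check_nccl_dir` is a hypothetical helper I made up (it is not part of FasterTransformer or NCCL), and it assumes the usual `<prefix>/include/nccl.h` and `<prefix>/lib/libnccl.so*` layout, which may differ on your system:

```shell
# Hypothetical helper: reports whether an NCCL prefix looks complete.
# Assumes the layout <prefix>/include/nccl.h and <prefix>/lib/libnccl.so*.
check_nccl_dir() {
    root="$1"
    if [ -f "$root/include/nccl.h" ] && ls "$root"/lib/libnccl.so* >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

# Example: check the prefix used in the exports above.
check_nccl_dir /usr/local/nccl2
```

If this prints "missing", the exports will probably not help until they point at a directory where both files exist.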
While trying to build with PyTorch, I am getting a CMake error.
CMake Error at /opt/conda/envs/fastertransformer/lib/python3.8/site-packages/cmake/data/share/cmake-3.26/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find NCCL (missing: NCCL_INCLUDE_DIRS NCCL_LIBRARIES)
Call Stack (most recent call first):
  /opt/conda/envs/fastertransformer/lib/python3.8/site-packages/cmake/data/share/cmake-3.26/Modules/FindPackageHandleStandardArgs.cmake:600 (_FPHSA_FAILURE_MESSAGE)
  cmake/Modules/FindNCCL.cmake:126 (find_package_handle_standard_args)
  CMakeLists.txt:84 (find_package)
I checked, and I do have NCCL installed. Here is the command I ran and its output:
python -c "import torch;print(torch.cuda.nccl.version())"
Output: (2, 14, 3)
I'm not sure what the issue is.