NVIDIA / nccl

Optimized primitives for collective multi-GPU communication

git branch version is broken for the latest release. #684

Open amrragab8080 opened 2 years ago

amrragab8080 commented 2 years ago

I cloned the repository and checked out a specific version. Git reports that I switched to that version's branch:

git clone https://github.com/NVIDIA/nccl /opt/nccl \
    && cd /opt/nccl \
    && git checkout -b v2.11.4-1 \
    && make -j src.build CUDA_HOME=/usr/local/cuda \
    NVCC_GENCODE="-gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_60,code=sm_60"
 ---> Running in 8c76f818b8db
Cloning into '/opt/nccl'...
Switched to a new branch 'v2.11.4-1'

However, later in the build, the output shows that version 2.12.12 was compiled instead:

make[2]: Leaving directory '/opt/nccl/src/collectives/device'
Linking    libnccl.so.2.12.12                  > /opt/nccl/build/lib/libnccl.so.2.12.12
Archiving  libnccl_static.a                    > /opt/nccl/build/lib/libnccl_static.a

AddyLaddy commented 2 years ago

I think the command should be:

git checkout tags/v2.11.4-1 -b 2.11.4

You can always check the version.mk file to be sure:

cat makefiles/version.mk 
##### version
NCCL_MAJOR   := 2
NCCL_MINOR   := 11
NCCL_PATCH   := 4
NCCL_SUFFIX  :=
PKG_REVISION := 1
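
The difference between the two checkout forms can be reproduced in a throwaway repository. The following is a minimal sketch, not taken from the NCCL repository: the tag name v1.0, the branch name build-1.0, and the commit messages are all made up for illustration. It relies on the documented behavior that git checkout -b NAME with no start point creates the branch at the current HEAD, so a tag with the same name is never consulted.

```shell
#!/bin/sh
set -e

# Throwaway repository standing in for the real one; all names are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Helper that makes an empty commit without depending on global git config.
commit() {
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$1"
}

commit "release 1.0"
git tag v1.0
commit "release 2.0"    # the default branch has now moved past the tag

# Form from the original report: creates a branch NAMED v1.0 at the current HEAD.
git checkout -q -b v1.0
wrong=$(git log -1 --format=%s)

# Suggested form: creates branch build-1.0 starting AT the tag v1.0.
git checkout -q tags/v1.0 -b build-1.0
right=$(git log -1 --format=%s)

echo "checkout -b v1.0                   -> $wrong"
echo "checkout tags/v1.0 -b build-1.0    -> $right"
```

The first form leaves the working tree on the newest commit ("release 2.0") even though the branch carries the tag's name, which matches the 2.12.12 build seen in the report. The second form lands on the tagged commit ("release 1.0").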

amrragab8080 commented 2 years ago

Great, thanks. It's working now, and I've added checks to my MLOps pipeline.
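
One way such a pipeline check might look is sketched below. This is an assumption, not the check actually used in the thread: the function names nccl_version_from_mk and check_nccl_version are hypothetical, and the sketch assumes version.mk keeps the "NAME := value" layout shown above and that tags follow the vMAJOR.MINOR.PATCH-REVISION pattern of v2.11.4-1. NCCL_SUFFIX is ignored for simplicity.

```shell
#!/bin/sh

# Hypothetical helper: rebuild "MAJOR.MINOR.PATCH-REVISION" from a
# version.mk-style file whose lines look like "NAME := value".
nccl_version_from_mk() {
    awk '
        $1 == "NCCL_MAJOR"   { major = $3 }
        $1 == "NCCL_MINOR"   { minor = $3 }
        $1 == "NCCL_PATCH"   { patch = $3 }
        $1 == "PKG_REVISION" { rev   = $3 }
        END { printf "%s.%s.%s-%s\n", major, minor, patch, rev }
    ' "$1"
}

# Hypothetical guard: return non-zero when the checked-out source does not
# match the expected tag, so the build can stop before compiling.
check_nccl_version() {
    expected="$1"    # for example v2.11.4-1
    actual="v$(nccl_version_from_mk "$2")"
    if [ "$actual" != "$expected" ]; then
        echo "version mismatch: expected $expected, found $actual" >&2
        return 1
    fi
    echo "version ok: $actual"
}
```

In a build script this could be called as check_nccl_version v2.11.4-1 makefiles/version.mk before the make step, chained with && so that a mismatch stops the build.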