Hello MSCCL team,
Thanks for the excellent work. I have some issues when building MSCCL in my own environment.
I am currently using Ubuntu 18.04 machines whose GPUs are connected over PCIe, not NVLink. I tried to build MSCCL on two machines: the first has 2 x V100 32GB, and the second has 2 x A5000 GPUs. Both builds use CUDA 11.1, which is set as the default CUDA path on each machine.
However, when I tried to build MSCCL following the instructions in the official repo, the build froze with lots of warnings and then failed. (I tried building both from the source zip file and from a clone of the git repository, and neither succeeded.)
I tried the fixes suggested in previous build-error issues, but they do not seem to work in my situation. I am also wondering whether MSCCL is only compatible with systems that have NVLink, but I am not sure about that.
I've attached the error logs from both attempts (building from the source zip file and from the cloned git repo). Can I get some advice on my issue?
error_msccl_git_build.log error_msccl_source_build.log