SeekPoint opened this issue 8 months ago (status: Open)
I suggest changing the command as follows: CXX=/usr/local/cuda-11.7/bin/nvcc BIN_HOME=$HOME/amd00/yk_repo/NCCL/nccl/build/obj/ SRC_HOME=$HOME/amd00/yk_repo/NCCL/nccl/src make
Hello, I have the same problem. My msccl-scheduler path is: root@docker-desktop:/home/msccl-tool/msccl/scheduler/msccl-scheduler# and my nccl path is: /home/nccl-tool/nccl/build/obj
I tried these two commands: root@docker-desktop:/home/msccl-tool/msccl/scheduler/msccl-scheduler# CXX=/usr/local/cuda/bin/nvcc BIN_HOME=/home/nccl-tool/nccl/build/obj/ SRC_HOME=/home/nccl-tool/nccl/src/ make and root@docker-desktop:/home/msccl-tool/msccl/scheduler/msccl-scheduler# CXX=/usr/local/cuda/bin/nvcc BIN_HOME=$HOME/nccl-tool/nccl/build/obj/ SRC_HOME=$HOME/nccl-tool/nccl/src/ make
Both of them fail with the same error:
Compiling & Linking libmsccl-scheduler.so.1.0.0 > /home/msccl-tool/msccl/scheduler/msccl-scheduler/build/lib/libmsccl-scheduler.so.1.0.0
Compiling & Linking src/scheduler.cc > src/parser.cc
mkdir -p /home/msccl-tool/msccl/scheduler/msccl-scheduler/build/lib
/usr/local/cuda-11.7/bin/nvcc -I/root/nccl-tool/nccl/build/obj//include -I/root/nccl-tool/nccl/src//src/include --compiler-options -fPIC,-shared,-DNCCL -o /home/msccl-tool/msccl/scheduler/msccl-scheduler/build/lib/libmsccl-scheduler.so.1.0.0 --linker-options -soname,libmsccl-scheduler.so.1 -lcurl src/scheduler.cc src/parser.cc
In file included from src/scheduler.cc:21:
src/parser.h:21:10: fatal error: msccl/msccl_scheduler.h: No such file or directory
21 | #include "msccl/msccl_scheduler.h"
| ^~~~~~~~~
compilation terminated.
make: *** [Makefile:43: /home/msccl-tool/msccl/scheduler/msccl-scheduler/build/lib/libmsccl-scheduler.so.1.0.0] Error 1.
So I ran the following command: root@docker-desktop:/home/msccl-tool# find -name msccl_scheduler.h, which prints ./msccl/executor/msccl-executor-nccl/src/include/msccl/msccl_scheduler.h
I am confused: I cannot find a way to make the build pick up the /home/msccl-tool/msccl/executor/msccl-executor-nccl/src/include/msccl/msccl_scheduler.h file.
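One possible reading of the log above: the nvcc line passes -I<BIN_HOME>/include and -I<SRC_HOME>/src/include, so the Makefile appears to append /src/include to SRC_HOME. If that holds, SRC_HOME would need to name the directory whose src/include contains msccl/msccl_scheduler.h, rather than the NCCL src directory. A minimal sketch of that derivation follows. The helper name derive_src_home is hypothetical, and whether BIN_HOME also has to change is not confirmed anywhere in this thread.

```shell
# derive_src_home HEADER_PATH  (hypothetical helper, not part of msccl)
# Strips the trailing /src/include/msccl/msccl_scheduler.h from the path
# reported by find, leaving the directory to try as SRC_HOME.
derive_src_home() {
  header_path=$1
  suffix=/src/include/msccl/msccl_scheduler.h
  case "$header_path" in
    *"$suffix")
      printf '%s\n' "${header_path%"$suffix"}"
      ;;
    *)
      echo "unexpected header location: $header_path" >&2
      return 1
      ;;
  esac
}

# Example with the absolute path from the find output above:
derive_src_home /home/msccl-tool/msccl/executor/msccl-executor-nccl/src/include/msccl/msccl_scheduler.h
# prints /home/msccl-tool/msccl/executor/msccl-executor-nccl
```

The printed directory could then be tried as SRC_HOME in the make command. Note also that in a root shell $HOME expands to /root, which matches the -I/root/nccl-tool/... paths in the log, so the absolute form of the path is the safer one to pass.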
amd00@MZ32-00:~/yk_repo/NCCL/msccl-az$ git remote -v origin https://github.com/Azure/msccl.git (fetch)
amd00@MZ32-00:~/yk_repo/NCCL/msccl-az/scheduler/msccl-scheduler$ CXX=/usr/local/cuda-11.7/bin/nvcc BIN_HOME=/home/amd00/yk_repo/NCCL/nccl/build/obj/ SRC_HOME=/home/amd00/yk_repo/NCCL/nccl/src make
Compiling & Linking libmsccl-scheduler.so.1.0.0 > /home/amd00/yk_repo/NCCL/msccl-az/scheduler/msccl-scheduler/build/lib/libmsccl-scheduler.so.1.0.0
Compiling & Linking src/scheduler.cc > src/parser.cc
mkdir -p /home/amd00/yk_repo/NCCL/msccl-az/scheduler/msccl-scheduler/build/lib
/usr/local/cuda-11.7/bin/nvcc -I/home/amd00/yk_repo/NCCL/nccl/build/obj//include -I/home/amd00/yk_repo/NCCL/nccl/src/src/include --compiler-options -fPIC,-shared,-DNCCL -o /home/amd00/yk_repo/NCCL/msccl-az/scheduler/msccl-scheduler/build/lib/libmsccl-scheduler.so.1.0.0 --linker-options -soname,libmsccl-scheduler.so.1 -lcurl src/scheduler.cc src/parser.cc
In file included from src/scheduler.cc:21:
src/parser.h:21:10: fatal error: msccl/msccl_scheduler.h: No such file or directory
21 | #include "msccl/msccl_scheduler.h"
| ^~~~~~~~
compilation terminated.
make: *** [Makefile:43: /home/amd00/yk_repo/NCCL/msccl-az/scheduler/msccl-scheduler/build/lib/libmsccl-scheduler.so.1.0.0] Error 1
amd00@MZ32-00:~/yk_repo/NCCL/msccl-az/scheduler/msccl-scheduler$
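Both failing logs show the same two include directories on the nvcc line: BIN_HOME with /include appended and SRC_HOME with /src/include appended. A quick way to check a BIN_HOME/SRC_HOME pair before running make is to test whether either directory actually contains msccl/msccl_scheduler.h. The sketch below assumes the Makefile builds its -I flags exactly as the logs suggest; has_scheduler_header is a hypothetical helper name, not something shipped with msccl.

```shell
# has_scheduler_header BIN_HOME SRC_HOME  (hypothetical helper)
# Succeeds if either include directory seen on the nvcc line
# contains msccl/msccl_scheduler.h; otherwise reports where it looked.
has_scheduler_header() {
  bin_home=$1
  src_home=$2
  for dir in "$bin_home/include" "$src_home/src/include"; do
    if [ -f "$dir/msccl/msccl_scheduler.h" ]; then
      printf 'found in %s\n' "$dir"
      return 0
    fi
  done
  echo "msccl/msccl_scheduler.h not found under $bin_home/include or $src_home/src/include" >&2
  return 1
}
```

For example, `has_scheduler_header /home/amd00/yk_repo/NCCL/nccl/build/obj/ /home/amd00/yk_repo/NCCL/nccl/src` would be expected to fail for the setup in the log above, since the compiler could not find the header in either directory.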