Open TianheLu opened 11 months ago
Perhaps you at least need to upgrade your cuda toolkit to 11.0
Thanks for helping. I ran "nvcc -V" and the result is:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Mon_Oct_24_19:12:58_PDT_2022
Cuda compilation tools, release 12.0, V12.0.76
Build cuda_12.0.r12.0/compiler.31968024_0
And I ran "nvidia-smi"; the result is:
NVIDIA-SMI 525.60.13 Driver Version: 525.60.13 CUDA Version: 12.0
It seems that my CUDA version is 12.0. I still don't know why this error happens.
Thank you in advance.
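As a side note, the toolkit release can be pulled out of the "nvcc -V" text mechanically, which makes it easier to compare the nvcc on PATH with the one under CUDA_HOME. The following is only a minimal sketch: nvcc_release is a hypothetical helper name, and the sample line is copied from the output quoted above.

```shell
# Hypothetical helper: read `nvcc -V` output on stdin and print the
# toolkit release number (the "release X.Y" field).
nvcc_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Sample line taken from the output quoted in this thread.
sample='Cuda compilation tools, release 12.0, V12.0.76'
printf '%s\n' "$sample" | nvcc_release   # prints 12.0

# If an nvcc is on PATH, report where it lives and which release it is.
if command -v nvcc >/dev/null 2>&1; then
  echo "nvcc on PATH: $(command -v nvcc), release $(nvcc -V | nvcc_release)"
fi
```

Note that nvidia-smi reports the highest CUDA version the driver supports, not the version of the installed toolkit, so the two numbers do not have to match.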
What is your compilation command? Can you show me the output of echo $CUDA_HOME?
The result of echo $CUDA_HOME is:
:/usr/local/cuda
I use 'make -j src.build' to build.
Just want to confirm: your CUDA_HOME should be /usr/local/cuda, not :/usr/local/cuda (with a leading colon). After fixing that, can you try make -j src.build CUDA_HOME=$CUDA_HOME?
I changed my CUDA_HOME from :/usr/local/cuda to /usr/local/cuda, and I also tried make -j src.build CUDA_HOME=$CUDA_HOME, but I got the same error.
Looks weird. Could you please share your PATH environment variable and the output of $CUDA_HOME/bin/nvcc -V? Also, do you have any environment variable that sets up the include path, such as C_INCLUDE_PATH or CXX_INCLUDE_PATH?
The PATH is: /opt/anaconda3/bin:/opt/anaconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/cuda/bin
The result of $CUDA_HOME/bin/nvcc -V is:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Mon_Oct_24_19:12:58_PDT_2022
Cuda compilation tools, release 12.0, V12.0.76
Build cuda_12.0.r12.0/compiler.31968024_0
And I have no include path such as C_INCLUDE_PATH or CXX_INCLUDE_PATH.
I also think it's weird. Thanks.
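Since the error says CUmemAllocationHandleType is undeclared, one further check is whether the cuda.h files on the machine actually contain that name. An older or partial cuda.h picked up ahead of the one in $CUDA_HOME/include would produce this kind of error. The sketch below is a diagnostic under assumptions, not something confirmed in this thread: header_declares is a hypothetical helper, and /usr/include and /usr/local/include are only guesses at where a stray copy might live. It also prints the include-path variables GCC documents (CPATH, C_INCLUDE_PATH, CPLUS_INCLUDE_PATH).

```shell
# Hypothetical helper: succeed if header file $1 mentions symbol $2.
header_declares() {
  grep -q "$2" "$1" 2>/dev/null
}

# Check each candidate cuda.h for the type the compiler could not find.
# The extra directories are assumptions about where a stale copy could be.
for dir in "${CUDA_HOME:-/usr/local/cuda}/include" /usr/include /usr/local/include; do
  header="$dir/cuda.h"
  [ -f "$header" ] || continue
  if header_declares "$header" CUmemAllocationHandleType; then
    echo "$header: declares CUmemAllocationHandleType"
  else
    echo "$header: does NOT declare CUmemAllocationHandleType"
  fi
done

# Show the include-path environment variables that GCC honors.
for var in CPATH C_INCLUDE_PATH CPLUS_INCLUDE_PATH; do
  echo "$var=$(printenv "$var" || echo '<unset>')"
done
```

If a cuda.h outside $CUDA_HOME/include turns up without the symbol, that would suggest a leftover older toolkit, which fits the suspicion that the CUDA installation itself is the problem.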
If so, could you please upgrade your cuda to 12.1? I am afraid your previous cuda installation has a problem.
Yeah, I will have a try. Thank you very much.
Hello, I tried to build NCCL, but I got the following errors:

transport/p2p.cc: In function ‘ncclResult_t ncclP2pFreeShareableBuffer(ncclIpcDesc*)’:
transport/p2p.cc:220:5: error: ‘CUmemAllocationHandleType’ was not declared in this scope
  220 | CUmemAllocationHandleType type = NCCL_P2P_HANDLE_TYPE;
      | ^~~~~~~~
transport/p2p.cc:222:9: error: ‘type’ was not declared in this scope
  222 | if (type == CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR) {
      | ^~~~
make[2]: Entering directory '/root/nccl/nccl/src/collectives/device'
transport/p2p.cc:222:17: error: ‘CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR’ was not declared in this scope
  222 | if (type == CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR) {
      | ^~~~~~~~~~~~
make[1]: *** [Makefile:119: /root/nccl/nccl/build/obj/transport/p2p.o] Error 1

I don't know why it happens, and I believe there will be many errors like this. Thanks so much.