lucasjinreal / DCNv2_latest

DCNv2 with support for recent PyTorch versions such as torch 1.5+ (now 1.8+)
BSD 3-Clause "New" or "Revised" License
614 stars 125 forks

Still get "THCudaBlas_Sgemm is undefined" error with Pytorch1.8+Cuda10.2 on Jetson AGX #25

Closed ko440124 closed 3 years ago

ko440124 commented 3 years ago

Hi, I tried to run make.sh on a Jetson AGX with PyTorch 1.8 + CUDA 10.2, but got the following error.

`/usr/local/cuda/bin/nvcc -DWITH_CUDA -I/home/get/Downloads/DCNv2_latest-master/src -I/home/get/.local/lib/python3.6/site-packages/torch/include -I/home/get/.local/lib/python3.6/site-packages/torch/include/torch/csrc/api/include -I/home/get/.local/lib/python3.6/site-packages/torch/include/TH -I/home/get/.local/lib/python3.6/site-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.6m -c /home/get/Downloads/DCNv2_latest-master/src/cuda/dcn_v2_cuda.cu -o build/temp.linux-aarch64-3.6/home/get/Downloads/DCNv2_latest-master/src/cuda/dcn_v2_cuda.o -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -DCUDA_NO_BFLOAT16_CONVERSIONS -DCUDA_NO_HALF2_OPERATORS --expt-relaxed-constexpr --compiler-options '-fPIC' -DCUDA_HAS_FP16=1 -DCUDA_NO_HALF_OPERATORS -DCUDA_NO_HALF_CONVERSIONS -D__CUDA_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_72,code=compute_72 -gencode=arch=compute_72,code=sm_72 -ccbin g++ -std=c++14
/home/get/Downloads/DCNv2_latest-master/src/cuda/dcn_v2_cuda.cu(127): error: identifier "THCudaBlas_SgemmBatched" is undefined

/home/get/Downloads/DCNv2_latest-master/src/cuda/dcn_v2_cuda.cu(274): error: identifier "THCudaBlas_Sgemm" is undefined

2 errors detected in the compilation of "/tmp/tmpxft_00005223_00000000-6_dcn_v2_cuda.cpp1.ii".
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 1`

The main errors, in short:

/home/get/Downloads/DCNv2_latest-master/src/cuda/dcn_v2_cuda.cu(127): error: identifier "THCudaBlas_SgemmBatched" is undefined

/home/get/Downloads/DCNv2_latest-master/src/cuda/dcn_v2_cuda.cu(274): error: identifier "THCudaBlas_Sgemm" is undefined

lucasjinreal commented 3 years ago

Sorry, I cannot reproduce this. Both PyTorch 1.7 and 1.8 work for me. [two screenshots]

leiwen83 commented 3 years ago

The Docker image nvcr.io/nvidia/pytorch:20.12-py3 reproduces this error.

austinmw commented 3 years ago

@jinfagang I also got the same error with 1.8 recently, but it works after downgrading to 1.7.

McDifference commented 3 years ago

@jinfagang Same error with PyTorch 1.8.

lucasjinreal commented 3 years ago

@austinmw @McDifference That is weird. I tried PyTorch 1.8 and it worked on my side; you can see the output from my two different computers. Maybe 1.8.1 or 1.9 would help?

Uio96 commented 3 years ago

@jinfagang Thanks a lot for your efforts. I have the same error with PyTorch 1.8.1 & cuda_11.1 on an RTX 3090.

tdhcuong commented 3 years ago

I also got the same problem! In the newest versions of PyTorch (1.8.1+), THCBlas has been removed (https://github.com/pytorch/pytorch/pull/49725), so THCudaBlas_SgemmBatched and THCudaBlas_Sgemm can no longer be used. I tried the CUDA code from this repository and compiled it successfully with PyTorch 1.8.1 & cuda_11.1 on an RTX 3070.
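To check whether a local checkout still calls the removed THCBlas wrappers before running the build, something like the following could be used. This is only a minimal sketch: the helper names `find_thcblas_calls` and `scan_tree` are hypothetical, and the check is a plain text match for identifiers starting with `THCudaBlas_`, so it will also flag occurrences inside comments.

```python
import re
from pathlib import Path

# Matches any identifier beginning with THCudaBlas_, e.g. THCudaBlas_Sgemm.
THCBLAS_CALL = re.compile(r"\bTHCudaBlas_\w+")


def find_thcblas_calls(source: str):
    """Return (line_number, identifier) pairs for each THCudaBlas_* use."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in THCBLAS_CALL.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def scan_tree(root: str, suffixes=(".cu", ".cuh", ".cpp", ".h")):
    """Scan source files under root; return {path: hits} for files with hits."""
    report = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            hits = find_thcblas_calls(path.read_text(errors="ignore"))
            if hits:
                report[str(path)] = hits
    return report
```

Calling `scan_tree("src")` from the repository root (assuming the sources live under `src/`, as the paths in the build log suggest) would list every file and line that still references a `THCudaBlas_*` function. An empty result means this particular error should not come from those files.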

lucasjinreal commented 3 years ago

I have updated and removed THCudaBlas_SgemmBatched

AliRidaBahja commented 1 year ago

> I have updated and removed THCudaBlas_SgemmBatched

Hello, can you share how you did that?