SHI-Labs / Neighborhood-Attention-Transformer

Neighborhood Attention Transformer, arXiv 2022 / CVPR 2023. Dilated Neighborhood Attention Transformer, arXiv 2022
MIT License

CUDA extension error #33

Closed wytcsuch closed 2 years ago

wytcsuch commented 2 years ago

Thank you for your great work. However, I get an error when I build the CUDA extension. torch = 1.11.0, python = 3.7, cuda = 10.1

Traceback (most recent call last):
  File "/home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1746, in _run_ninja_build
    env=env)
  File "/home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/yckj3822/GAN/Neighborhood-Attention-Transformer-main/natten/nattencuda.py", line 20, in <module>
    'nattenav_cuda', [f'{this_dir}/src/nattenav_cuda.cpp', f'{this_dir}/src/nattenav_cuda_kernel.cu'], verbose=False)
  File "/home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1156, in load
    keep_intermediates=keep_intermediates)
  File "/home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1367, in _jit_compile
    is_standalone=is_standalone)
  File "/home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1472, in _write_ninja_file_and_build_library
    error_prefix=f"Error building extension '{name}'")
  File "/home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/utils/cpp_extension.py", line 1756, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'nattenav_cuda': [1/3] /home/yckj3822/anaconda3/envs/unsup3d/bin/x86_64-conda_cos6-linux-gnu-c++ -MMD -MF nattenav_cuda.o.d -DTORCH_EXTENSION_NAME=nattenav_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/TH -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /home/yckj3822/GAN/Neighborhood-Attention-Transformer-main/natten/src/nattenav_cuda.cpp -o nattenav_cuda.o
[2/3] /usr/local/cuda-10.1/bin/nvcc  -ccbin /home/yckj3822/anaconda3/envs/unsup3d/bin/x86_64-conda_cos6-linux-gnu-cc -DTORCH_EXTENSION_NAME=nattenav_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/TH -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -std=c++14 -c /home/yckj3822/GAN/Neighborhood-Attention-Transformer-main/natten/src/nattenav_cuda_kernel.cu -o nattenav_cuda_kernel.cuda.o
FAILED: nattenav_cuda_kernel.cuda.o
/usr/local/cuda-10.1/bin/nvcc  -ccbin /home/yckj3822/anaconda3/envs/unsup3d/bin/x86_64-conda_cos6-linux-gnu-cc -DTORCH_EXTENSION_NAME=nattenav_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/TH -isystem /home/yckj3822/anaconda3/envs/unsup3d/lib/python3.7/site-packages/torch/include/THC -isystem /usr/local/cuda-10.1/include -isystem /home/yckj3822/anaconda3/envs/unsup3d/include/python3.7m -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -std=c++14 -c /home/yckj3822/GAN/Neighborhood-Attention-Transformer-main/natten/src/nattenav_cuda_kernel.cu -o nattenav_cuda_kernel.cuda.o
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc: In instantiation of 'static std::basic_string<_CharT, _Traits, _Alloc>::_Rep* std::basic_string<_CharT, _Traits, _Alloc>::_Rep::_S_create(std::basic_string<_CharT, _Traits, _Alloc>::size_type, std::basic_string<_CharT, _Traits, _Alloc>::size_type, const _Alloc&) [with _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>; std::basic_string<_CharT, _Traits, _Alloc>::size_type = long unsigned int]':
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc:578:28:   required from 'static _CharT* std::basic_string<_CharT, _Traits, _Alloc>::_S_construct(_InIterator, _InIterator, const _Alloc&, std::forward_iterator_tag) [with _FwdIterator = const char16_t*; _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.h:5033:20:   required from 'static _CharT* std::basic_string<_CharT, _Traits, _Alloc>::_S_construct_aux(_InIterator, _InIterator, const _Alloc&, std::__false_type) [with _InIterator = const char16_t*; _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.h:5054:24:   required from 'static _CharT* std::basic_string<_CharT, _Traits, _Alloc>::_S_construct(_InIterator, _InIterator, const _Alloc&) [with _InIterator = const char16_t*; _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc:656:134:   required from 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const _CharT*, std::basic_string<_CharT, _Traits, _Alloc>::size_type, const _Alloc&) [with _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>; std::basic_string<_CharT, _Traits, _Alloc>::size_type = long unsigned int]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.h:6676:95:   required from here
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc:1067:16: error: cannot call member function 'void std::basic_string<_CharT, _Traits, _Alloc>::_Rep::_M_set_sharable() [with _CharT = char16_t; _Traits = std::char_traits<char16_t>; _Alloc = std::allocator<char16_t>]' without object
       __p->_M_set_sharable();
       ~~~~~~~~~^~
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc: In instantiation of 'static std::basic_string<_CharT, _Traits, _Alloc>::_Rep* std::basic_string<_CharT, _Traits, _Alloc>::_Rep::_S_create(std::basic_string<_CharT, _Traits, _Alloc>::size_type, std::basic_string<_CharT, _Traits, _Alloc>::size_type, const _Alloc&) [with _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>; std::basic_string<_CharT, _Traits, _Alloc>::size_type = long unsigned int]':
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc:578:28:   required from 'static _CharT* std::basic_string<_CharT, _Traits, _Alloc>::_S_construct(_InIterator, _InIterator, const _Alloc&, std::forward_iterator_tag) [with _FwdIterator = const char32_t*; _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.h:5033:20:   required from 'static _CharT* std::basic_string<_CharT, _Traits, _Alloc>::_S_construct_aux(_InIterator, _InIterator, const _Alloc&, std::__false_type) [with _InIterator = const char32_t*; _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.h:5054:24:   required from 'static _CharT* std::basic_string<_CharT, _Traits, _Alloc>::_S_construct(_InIterator, _InIterator, const _Alloc&) [with _InIterator = const char32_t*; _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc:656:134:   required from 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const _CharT*, std::basic_string<_CharT, _Traits, _Alloc>::size_type, const _Alloc&) [with _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>; std::basic_string<_CharT, _Traits, _Alloc>::size_type = long unsigned int]'
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.h:6681:95:   required from here
/home/yckj3822/anaconda3/envs/unsup3d/x86_64-conda_cos6-linux-gnu/include/c++/7.3.0/bits/basic_string.tcc:1067:16: error: cannot call member function 'void std::basic_string<_CharT, _Traits, _Alloc>::_Rep::_M_set_sharable() [with _CharT = char32_t; _Traits = std::char_traits<char32_t>; _Alloc = std::allocator<char32_t>]' without object
ninja: build stopped: subcommand failed.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/yckj3822/GAN/Neighborhood-Attention-Transformer-main/natten/nattencuda.py", line 27, in <module>
    import nattenav_cuda
ModuleNotFoundError: No module named 'nattenav_cuda'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "natten/gradcheck.py", line 11, in <module>
    from nattencuda import NATTENAVFunction, NATTENQKRPBFunction
  File "/home/yckj3822/GAN/Neighborhood-Attention-Transformer-main/natten/nattencuda.py", line 30, in <module>
    raise RuntimeError("Could not load NATTEN CUDA extension. " +
RuntimeError: Could not load NATTEN CUDA extension. Please make sure your device has CUDA, the CUDA toolkit for PyTorch is installed, and that you've compiled NATTEN correctly.
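
In a log this long, the lines that matter are the compiler diagnostics marked `error:`; the `required from` lines only trace the template instantiation. As a rough sketch (the helper name `extract_compiler_errors` and the regular expression are illustrative assumptions, not part of NATTEN or PyTorch), the errors can be pulled out of a saved build log like this:

```python
import re

# Matches GCC/nvcc-style diagnostics such as "path/file.tcc:1067:16: error: message".
ERROR_LINE = re.compile(
    r"^(?P<file>[^:\s][^:]*):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.*)$"
)


def extract_compiler_errors(build_log: str):
    """Return (file, line, column, message) for each compiler error in a build log.

    Hypothetical helper for illustration only.
    """
    errors = []
    for raw in build_log.splitlines():
        match = ERROR_LINE.match(raw.strip())
        if match:
            errors.append(
                (match["file"], int(match["line"]), int(match["col"]), match["msg"])
            )
    return errors
```

Applied to the log above, this should leave only the two `_M_set_sharable` errors raised from `basic_string.tcc`, which come from the libstdc++ 7.3.0 headers rather than from the NATTEN sources.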

alihassanijr commented 2 years ago

Hello and thank you for your interest.

This is strange: the build is failing, but I don't see exactly which part of the extension is causing it. I believe we have tried both Python 3.7 and 3.8 with torch 1.11, and NATTEN compiles in both cases. You did state that you're on CUDA v10.1, and as far as I'm aware the latest torch version built against that CUDA toolkit version was 1.8.1, so this might be the root cause. Can you confirm your torch version and the CUDA toolkit it was built with? That would help us reproduce the issue on our end so we can debug it. As far as I can see, torch 1.11 appears to have been built for three toolkits only: v10.2, 11.3, and 11.5. I just want to confirm which one you're on.

To get those, you can simply run:

python3 -c "import torch; print(torch.__version__); print(torch._C._cuda_getCompiledVersion())"

and this to get the version of the CUDA toolkit (compiler) actually installed on your machine:

nvcc --version

I can confirm that I tried a Docker image with CUDA v10.1, installed PyTorch and all the other requirements directly from the requirements.txt file, and it built successfully.
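
The two checks above can also be combined into one script. The following is a minimal sketch, assuming `nvcc` is on the `PATH` and using the public `torch.version.cuda` attribute instead of the private `_cuda_getCompiledVersion()` call; the helper names are hypothetical.

```python
import re
import subprocess


def parse_nvcc_release(nvcc_output: str):
    """Extract the 'release X.Y' version from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def parse_cuda_version(version: str):
    """Turn a string such as '10.2' (the format of torch.version.cuda) into (10, 2)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def toolkits_match(torch_cuda: str, nvcc_output: str) -> bool:
    """True when PyTorch's build-time CUDA version equals the local nvcc release."""
    return parse_cuda_version(torch_cuda) == parse_nvcc_release(nvcc_output)


def report_versions():
    """Print both versions; needs PyTorch installed and nvcc on PATH."""
    import torch  # imported lazily so the helpers above work without PyTorch

    nvcc_output = subprocess.run(
        ["nvcc", "--version"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    ).stdout
    print("torch:", torch.__version__, "built for CUDA", torch.version.cuda)
    print("nvcc release:", parse_nvcc_release(nvcc_output))
    # torch.version.cuda is None on CPU-only builds, so guard before comparing.
    if torch.version.cuda and not toolkits_match(torch.version.cuda, nvcc_output):
        print("Mismatch: the extension would be compiled with a different toolkit.")
```

Calling `report_versions()` on the affected machine would show whether the toolkit that compiles the extension differs from the one PyTorch was built against.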

wytcsuch commented 2 years ago

python3 -c "import torch; print(torch.__version__); print(torch._C._cuda_getCompiledVersion())"

Thank you very much for your reply! [Screenshots: WeChat screenshot 20220514110304, WeChat screenshot 20220514110601]

I have also changed the torch version to 1.8.0, and the same error still occurs. This is indeed a very strange error.

alihassanijr commented 2 years ago

I would recommend staying on torch 1.11, since we've found it to yield the best performance. As for this issue, I believe it's a known problem that PyTorch users have run into when using ninja with CUDA v10.1.105, which is the version you happen to be on. Can you try some of the fixes reported in PyTorch issue #1893, specifically this suggestion?

wytcsuch commented 2 years ago

Hi, I changed CUDA to version 10.2, which seems to have solved the problem.
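
For readers who have several toolkits installed side by side, one way to make the switch is to point `CUDA_HOME` at the desired toolkit before the extension is built. To my knowledge `torch.utils.cpp_extension` reads that variable when it is imported. The sketch below assumes the `/usr/local/cuda-<version>` layout seen in the log above; the helper names are hypothetical.

```python
import os


def pick_cuda_home(wanted: str, roots=("/usr/local",)):
    """Return the first '<root>/cuda-<wanted>' directory containing bin/nvcc, else None.

    The directory layout is an assumption based on the paths in the build log.
    """
    for root in roots:
        candidate = os.path.join(root, f"cuda-{wanted}")
        if os.path.isfile(os.path.join(candidate, "bin", "nvcc")):
            return candidate
    return None


def use_toolkit(wanted: str, roots=("/usr/local",)) -> bool:
    """Export CUDA_HOME for the requested toolkit.

    Call this before importing torch.utils.cpp_extension, which looks up
    CUDA_HOME at import time.
    """
    home = pick_cuda_home(wanted, roots)
    if home is None:
        return False
    os.environ["CUDA_HOME"] = home
    return True
```

For example, `use_toolkit("10.2")` at the top of the build script would select `/usr/local/cuda-10.2` if it exists, and return `False` otherwise so the script can stop with a clear message.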