mra-h closed this issue 1 year ago
Is this the whole error message?
I managed to make it work. I had a custom CUDA installed inside my conda environment, so I had to export CUDA_HOME to point to the correct location.
Hi @pengsida and @mra-h, I encountered the same error, and the log is listed below:
(mlp_maps) yuze@yuze-desktop:~/Documents/project/mlp_maps$ python train_net.py --config configs/nhr/sport1.py
Using /home/yuze/.cache/torch_extensions/py38_cu113 as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /home/yuze/.cache/torch_extensions/py38_cu113/_hash_encoder/build.ninja...
Building extension module _hash_encoder...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=_hash_encoder -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/include -isystem /home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/include/TH -isystem /home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/yuze/anaconda3/envs/mlp_maps/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 -std=c++14 -allow-unsupported-compiler -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -c /home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu -o hashencoder.cuda.o
FAILED: hashencoder.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=_hash_encoder -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/include -isystem /home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/include/TH -isystem /home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/yuze/anaconda3/envs/mlp_maps/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 -std=c++14 -allow-unsupported-compiler -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -c /home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu -o hashencoder.cuda.o
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(25): error: no instance of overloaded function "atomicAdd" matches the argument list
argument types are: (__half *, c10::Half)
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced
detected during:
instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, const scalar_t *, const scalar_t *, scalar_t *, uint32_t, uint32_t, float, uint32_t) [with scalar_t=double, D=2U, C=2U, N_C=2U]"
(687): here
instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=double, D=2U]"
(721): here
instantiation of "void hash_encode_second_backward_cuda(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=double]"
(817): here
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced
detected during:
instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, const scalar_t *, const scalar_t *, scalar_t *, uint32_t, uint32_t, float, uint32_t) [with scalar_t=double, D=2U, C=4U, N_C=2U]"
(692): here
instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=double, D=2U]"
(721): here
instantiation of "void hash_encode_second_backward_cuda(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=double]"
(817): here
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced
detected during:
instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, const scalar_t *, const scalar_t *, scalar_t *, uint32_t, uint32_t, float, uint32_t) [with scalar_t=double, D=2U, C=8U, N_C=2U]"
(698): here
instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=double, D=2U]"
(721): here
instantiation of "void hash_encode_second_backward_cuda(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=double]"
(817): here
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced
detected during:
instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, const scalar_t *, const scalar_t *, scalar_t *, uint32_t, uint32_t, float, uint32_t) [with scalar_t=double, D=3U, C=2U, N_C=2U]"
(687): here
instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=double, D=3U]"
(722): here
instantiation of "void hash_encode_second_backward_cuda(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=double]"
(817): here
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced
detected during:
instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, const scalar_t *, const scalar_t *, scalar_t *, uint32_t, uint32_t, float, uint32_t) [with scalar_t=double, D=3U, C=4U, N_C=2U]"
(692): here
instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=double, D=3U]"
(722): here
instantiation of "void hash_encode_second_backward_cuda(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=double]"
(817): here
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced
detected during:
instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, const scalar_t *, const scalar_t *, scalar_t *, uint32_t, uint32_t, float, uint32_t) [with scalar_t=double, D=3U, C=8U, N_C=2U]"
(698): here
instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=double, D=3U]"
(722): here
instantiation of "void hash_encode_second_backward_cuda(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=double]"
(817): here
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced
detected during:
instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, const scalar_t *, const scalar_t *, scalar_t *, uint32_t, uint32_t, float, uint32_t) [with scalar_t=float, D=2U, C=2U, N_C=2U]"
(687): here
instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=float, D=2U]"
(721): here
instantiation of "void hash_encode_second_backward_cuda(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=float]"
(817): here
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced
detected during:
instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, const scalar_t *, const scalar_t *, scalar_t *, uint32_t, uint32_t, float, uint32_t) [with scalar_t=float, D=2U, C=4U, N_C=2U]"
(692): here
instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=float, D=2U]"
(721): here
instantiation of "void hash_encode_second_backward_cuda(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=float]"
(817): here
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced
detected during:
instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, const scalar_t *, const scalar_t *, scalar_t *, uint32_t, uint32_t, float, uint32_t) [with scalar_t=float, D=2U, C=8U, N_C=2U]"
(698): here
instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=float, D=2U]"
(721): here
instantiation of "void hash_encode_second_backward_cuda(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=float]"
(817): here
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced
detected during:
instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, const scalar_t *, const scalar_t *, scalar_t *, uint32_t, uint32_t, float, uint32_t) [with scalar_t=float, D=3U, C=2U, N_C=2U]"
(687): here
instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=float, D=3U]"
(722): here
instantiation of "void hash_encode_second_backward_cuda(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=float]"
(817): here
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced
detected during:
instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, const scalar_t *, const scalar_t *, scalar_t *, uint32_t, uint32_t, float, uint32_t) [with scalar_t=float, D=3U, C=4U, N_C=2U]"
(692): here
instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=float, D=3U]"
(722): here
instantiation of "void hash_encode_second_backward_cuda(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=float]"
(817): here
/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced
detected during:
instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, const scalar_t *, const scalar_t *, scalar_t *, uint32_t, uint32_t, float, uint32_t) [with scalar_t=float, D=3U, C=8U, N_C=2U]"
(698): here
instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=float, D=3U]"
(722): here
instantiation of "void hash_encode_second_backward_cuda(const scalar_t *, const scalar_t *, const scalar_t *, const int *, uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t *, const scalar_t *, scalar_t *, scalar_t *) [with scalar_t=float]"
(817): here
1 error detected in the compilation of "/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/src/hashencoder.cu".
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1808, in _run_ninja_build
subprocess.run(
File "/home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "train_net.py", line 109, in <module>
main()
File "train_net.py", line 105, in main
train(cfg)
File "train_net.py", line 25, in train
network = make_network(cfg)
File "/home/yuze/Documents/project/mlp_maps/lib/networks/make_network.py", line 8, in make_network
network = imp.load_source(module, path).Network()
File "/home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/imp.py", line 171, in load_source
module = _load(spec)
File "<frozen importlib._bootstrap>", line 702, in _load
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "lib/networks/dymap.py", line 9, in <module>
from lib.csrc.hashencoder import HashEncoder
File "/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/__init__.py", line 1, in <module>
from .hashgrid import HashEncoder
File "/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/hashgrid.py", line 12, in <module>
from .backend import _backend
File "/home/yuze/Documents/project/mlp_maps/lib/csrc/hashencoder/backend.py", line 10, in <module>
_backend = load(name='_hash_encoder',
File "/home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1202, in load
return _jit_compile(
File "/home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1425, in _jit_compile
_write_ninja_file_and_build_library(
File "/home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1537, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/home/yuze/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension '_hash_encoder'
I am running the code on a 1070 Ti, with cuda-11.3 and torch-12.1. I wonder whether the compilation failed because an older graphics card (1070 Ti) does not support this atomicAdd operation. Or am I not setting CUDA_HOME correctly, as @mra-h said? Can you provide more information? Thank you.
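One way to check the first guess against the log: the failing nvcc command targets compute_61, and the error is about atomicAdd on (__half *, c10::Half). As far as the CUDA documentation describes it, the native atomicAdd overload for __half is only available from compute capability 7.0 upward, so a build for 6.1 would not find it. The sketch below is only an illustration under that assumption; the helper names and the threshold constant are hypothetical and are not part of mlp_maps or PyTorch.

```python
import re

# Assumption: native atomicAdd on __half needs compute capability 7.0 or newer.
HALF_ATOMIC_ADD_MIN_CC = 70


def gencode_archs(nvcc_command):
    """Return the sorted compute capabilities named in -gencode flags."""
    found = re.findall(r"arch=compute_(\d+)", nvcc_command)
    return sorted({int(cc) for cc in found})


def archs_lacking_half_atomic_add(nvcc_command):
    """List the targeted architectures below the assumed threshold."""
    return [cc for cc in gencode_archs(nvcc_command)
            if cc < HALF_ATOMIC_ADD_MIN_CC]


# Shortened version of the nvcc command from the log above.
cmd = ("nvcc -gencode=arch=compute_61,code=compute_61 "
       "-gencode=arch=compute_61,code=sm_61 -c hashencoder.cu")
print(archs_lacking_half_atomic_add(cmd))  # prints [61]
```

If the list is non-empty, the half-precision atomicAdd path is the likely culprit rather than CUDA_HOME, although both can be wrong at the same time.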
https://github.com/CharlesShang/DCNv2/issues/5 It looks like you are not using the correct CUDA version.
Hello,
Thanks for the awesome work. I am trying to execute the training script however I get the following error:
from .backend import _backend
File "/home/manuel/projects/mlp_maps/lib/csrc/hashencoder/backend.py", line 10, in <module>
_backend = load(name='_hash_encoder',
File "/home/manuel/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1202, in load
return _jit_compile(
File "/home/manuel/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1425, in _jit_compile
_write_ninja_file_and_build_library(
File "/home/manuel/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1537, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/home/manuel/anaconda3/envs/mlp_maps/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension '_hash_encoder'
Any idea what this is due to? I am running it on a 2080Ti
Hi, I have the same error on the 2080Ti. Could you please show me the specific modifications you made? I've been struggling to solve this problem. Thank you very much. @mra-h
Hi. I don't remember exactly which steps I took, but to export CUDA_HOME you need something like:
export CUDA_HOME=path/to/cuda
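Before exporting anything, it can help to see which toolkit a build is likely to pick up. The sketch below is a rough stand-in, not PyTorch's actual lookup: it assumes an explicit CUDA_HOME (or CUDA_PATH) takes priority and that otherwise the toolkit root is two directories above the first nvcc found on PATH. guess_cuda_home is a hypothetical helper name.

```python
import os
import shutil


def guess_cuda_home(env=None):
    """Best guess at the CUDA toolkit root a build would use (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: an explicit CUDA_HOME wins, with CUDA_PATH as a fallback.
    explicit = env.get("CUDA_HOME") or env.get("CUDA_PATH")
    if explicit:
        return explicit
    # Otherwise derive the root from the first nvcc on PATH: <root>/bin/nvcc.
    nvcc = shutil.which("nvcc", path=env.get("PATH"))
    if nvcc:
        return os.path.dirname(os.path.dirname(nvcc))
    return None


# Shows the guess for the current shell, or None if nothing is found.
print(guess_cuda_home())
```

In the log earlier in this thread the compiler was /usr/local/cuda/bin/nvcc, so a toolkit installed inside a conda environment would only be used after CUDA_HOME is pointed at it.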