Open limaolin2017 opened 5 months ago
I see this error: "fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory". For me, this file lives in:
/home/ruilongli/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/include/ATen/cuda/CUDAGeneratorImpl.h
You might want to check if your torch is installed in this conda env correctly.
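One way to run that check from inside the affected conda env is sketched below. It only assumes the layout shown in the path above (the header sits under torch's `include` directory); the function name `find_torch_header` is made up for this illustration and is not part of torch or nerfacc.

```python
from importlib.util import find_spec
from pathlib import Path
from typing import Optional

HEADER = "ATen/cuda/CUDAGeneratorImpl.h"


def find_torch_header(relative_path: str = HEADER) -> Optional[Path]:
    """Return the header's path inside this env's torch install, or None."""
    spec = find_spec("torch")  # locates the package without importing it
    if spec is None or not spec.submodule_search_locations:
        return None  # torch is not installed for this interpreter
    for package_dir in spec.submodule_search_locations:
        candidate = Path(package_dir) / "include" / relative_path
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    found = find_torch_header()
    print(found if found else f"{HEADER} not found in this environment's torch")
```

If this prints "not found" when run with the env's own python, the JIT build of nerfacc_cuda would be expected to fail on the same include.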
Alternately, you could install our prebuilt wheels from here
Hi, does the latest version of nerfacc support torch 1.10?
Yes
I switched to another conda env.
Env info: 'CUDA 11.3, Torch 1.11'
I encountered some errors: 'Traceback (most recent call last):
File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 201, in <module>
Could you dump the input of this function here and share it so I can take a look?
traverse_grids inputs: {'rays_o': tensor([[-0.2437, -0.0069, 0.0255], [-0.2437, -0.0069, 0.0255], [-0.2437, -0.0069, 0.0255], ..., [-0.2437, -0.0069, 0.0255], [-0.2437, -0.0069, 0.0255], [-0.2437, -0.0069, 0.0255]], device='cuda:0'), 'rays_d': tensor([[ 0.9839, 0.1560, -0.2869], [ 0.9840, 0.1560, -0.2848], [ 0.9841, 0.1559, -0.2827], ..., [ 0.9838, 0.1237, -0.3298], [ 0.9839, 0.1236, -0.3277], [ 0.9840, 0.1235, -0.3256]], device='cuda:0'), 'binaries': tensor([[[[False, False, False, ..., False, False, False], [False, False, False, ..., False, False, False], [False, False, False, ..., False, False, False], ..., [False, False, False, ..., False, False, False], [False, False, False, ..., False, False, False], [False, False, False, ..., False, False, False]],
[[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]],
[[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]],
...,
[[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]],
[[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]],
[[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
...,
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False],
[False, False, False, ..., False, False, False]]]]), 'aabbs': tensor([[0.0000, 0.0000, 0.0000, 0.3000, 0.3000, 0.3000]], device='cuda:0'), 'near_planes': tensor([0.2000, 0.2000, 0.2000, ..., 0.2000, 0.2000, 0.2000], device='cuda:0'), 'far_planes': tensor([1., 1., 1., ..., 1., 1., 1.], device='cuda:0'), 'step_size': 0.001, 'cone_angle': 0.0}
I can't use these pasted outputs to examine the code. Could you save them into a file (say .pth or .npz) and upload it here?
Thank you for checking!
I have checked the input arguments for NaN values and verified their shapes and types, and found no issues. Do you have any suggestions on how to handle this?
I have also checked the GPU memory usage; it is normal.
Hi, I will check this issue after the ECCV deadline tomorrow!
I'm having the same issue with the sampling function. It returns:
File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/nerfacc/estimators/occgrid.py", line 164, in sampling
intervals, samples, _ = traverse_grids(
File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/nerfacc/grid.py", line 165, in traverse_grids
intervals, samples, termination_planes = _C.traverse_grids(
File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/nerfacc/cuda/__init__.py", line 13, in call_cuda
return getattr(_C, name)(*args, **kwargs)
RuntimeError: CUDA error: an illegal memory access was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
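A note on reading this traceback: CUDA kernel errors can be reported asynchronously, so the Python frame shown is not necessarily where the fault happened. A common debugging step is to set the CUDA_LAUNCH_BLOCKING environment variable before CUDA is initialized. The sketch below is one way to do that from Python; the helper name is invented for this example.

```python
import os


def enable_blocking_cuda_launches() -> None:
    """Ask the CUDA runtime to run kernels synchronously (slow; debugging only)."""
    # This has to happen before the process initializes CUDA, so call it
    # before importing torch or nerfacc.
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


enable_blocking_cuda_launches()
# import torch, nerfacc  # import the CUDA libraries only after this point
```

With this set, the error should surface at the call that actually launched the failing kernel, at the cost of much slower execution.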
Is there any update on the issue? Thanks
I have solved the issue. I hadn't moved my estimator to the GPU, which caused the illegal memory access.
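This matches the pasted traverse_grids inputs earlier in the thread, where 'binaries' is the only tensor printed without device='cuda:0'. A rough sketch of a check that would flag this before the CUDA call is below. It only relies on each value having a `.device` attribute, as torch tensors do; the function name and the stand-in objects are made up for illustration so the snippet runs without a GPU.

```python
from types import SimpleNamespace
from typing import Any, Dict, Mapping


def find_device_mismatches(
    tensors: Mapping[str, Any], expected: str = "cuda:0"
) -> Dict[str, str]:
    """Map each tensor name to its device when it differs from `expected`."""
    mismatches = {}
    for name, value in tensors.items():
        device = getattr(value, "device", None)  # floats like step_size have none
        if device is not None and str(device) != expected:
            mismatches[name] = str(device)
    return mismatches


# Stand-ins with only a `.device` attribute (hypothetical data, not real tensors).
inputs = {
    "rays_o": SimpleNamespace(device="cuda:0"),
    "binaries": SimpleNamespace(device="cpu"),
    "step_size": 0.001,
}
print(find_device_mismatches(inputs))  # {'binaries': 'cpu'}
```

In real code the dict would be the actual arguments passed to traverse_grids. Assuming the estimator is a torch.nn.Module, calling estimator.to("cuda:0") once after constructing it should move its grid tensors to the GPU.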
I hit the following error the first time I ran this:
Error file: 'Traceback (most recent call last):
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/_backend.py", line 53, in <module>
from nerfacc import csrc as _C
ImportError: cannot import name 'csrc' from 'nerfacc' (/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/__init__.py)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1717, in _run_ninja_build
subprocess.run(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 201, in <module>
app.run(main)
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 140, in main
rendered_chunks = render_rays(nerf_models,
File "/gpfs/home/mli/banmo/nnutils/rendering.py", line 132, in render_rays
ray_indices, t_starts, t_ends = estimator.sampling(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/estimators/occgrid.py", line 164, in sampling
intervals, samples, _ = traverse_grids(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/grid.py", line 158, in traverse_grids
t_mins, t_maxs, hits = ray_aabb_intersect(rays_o, rays_d, aabbs)
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
return func(*args, **kwargs)
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/grid.py", line 43, in ray_aabb_intersect
t_mins, t_maxs, hits = _C.ray_aabb_intersect(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/__init__.py", line 11, in call_cuda
from ._backend import _C
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/_backend.py", line 61, in <module>
_C = load(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1124, in load
return _jit_compile(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1337, in _jit_compile
_write_ninja_file_and_build_library(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1449, in _write_ninja_file_and_build_library
_run_ninja_build(
File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error building extension 'nerfacc_cuda':
[1/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/grid.cu -o grid.cuda.o
FAILED: grid.cuda.o
/home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/grid.cu -o grid.cuda.o
/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/grid.cu:4:10: fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory
#include <ATen/cuda/CUDAGeneratorImpl.h>
compilation terminated.
[2/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/pdf.cu -o pdf.cuda.o
FAILED: pdf.cuda.o
/home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/pdf.cu -o pdf.cuda.o
/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/pdf.cu:4:10: fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory
#include <ATen/cuda/CUDAGeneratorImpl.h>
compilation terminated.
[3/6] g++ -MMD -MF nerfacc.o.d -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/nerfacc.cpp -o nerfacc.o
[4/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/camera.cu -o camera.cuda.o
[5/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/scan.cu -o scan.cuda.o
ninja: build stopped: subcommand failed.',
Info: 'CUDA 11.3, Torch 1.10, nerfacc 0.5.3'
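Since the env info above was typed by hand, here is a small sketch for collecting the same versions programmatically when reporting an issue. It uses only the standard library; the function name `installed_versions` is invented for this example, and a package missing from the env is reported as "not installed" rather than raising.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable


def installed_versions(
    packages: Iterable[str] = ("torch", "nerfacc"),
) -> Dict[str, str]:
    """Return the installed version of each package in the active environment."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


print(installed_versions())
```

When torch is importable, torch.version.cuda additionally reports the CUDA version the installed torch build targets, which may differ from the nvcc found on the PATH.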