nerfstudio-project / nerfacc

A General NeRF Acceleration Toolbox in PyTorch.
https://www.nerfacc.com/

Thanks for your contribution! #283

Open limaolin2017 opened 5 months ago

limaolin2017 commented 5 months ago

I hit the following error the first time I ran this:

Error file:

Traceback (most recent call last):
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/_backend.py", line 53, in <module>
    from nerfacc import csrc as _C
ImportError: cannot import name 'csrc' from 'nerfacc' (/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/__init__.py)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1717, in _run_ninja_build
    subprocess.run(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 201, in <module>
    app.run(main)
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 140, in main
    rendered_chunks = render_rays(nerf_models,
  File "/gpfs/home/mli/banmo/nnutils/rendering.py", line 132, in render_rays
    ray_indices, t_starts, t_ends = estimator.sampling(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/estimators/occgrid.py", line 164, in sampling
    intervals, samples, _ = traverse_grids(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/grid.py", line 158, in traverse_grids
    t_mins, t_maxs, hits = ray_aabb_intersect(rays_o, rays_d, aabbs)
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/grid.py", line 43, in ray_aabb_intersect
    t_mins, t_maxs, hits = _C.ray_aabb_intersect(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/__init__.py", line 11, in call_cuda
    from ._backend import _C
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/_backend.py", line 61, in <module>
    _C = load(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1124, in load
    return _jit_compile(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1337, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1449, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'nerfacc_cuda':
[1/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/grid.cu -o grid.cuda.o
FAILED: grid.cuda.o
/home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/grid.cu -o grid.cuda.o
/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/grid.cu:4:10: fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory

 #include <ATen/cuda/CUDAGeneratorImpl.h>
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

compilation terminated.
[2/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/pdf.cu -o pdf.cuda.o
FAILED: pdf.cuda.o
/home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/pdf.cu -o pdf.cuda.o
/home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/pdf.cu:4:10: fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory

 #include <ATen/cuda/CUDAGeneratorImpl.h>
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

compilation terminated.
[3/6] g++ -MMD -MF nerfacc.o.d -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/nerfacc.cpp -o nerfacc.o
[4/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/camera.cu -o camera.cuda.o
[5/6] /home/mli/.conda/envs/banmo-cu113/bin/nvcc -DTORCH_EXTENSION_NAME=nerfacc_cuda -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/TH -isystem /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/torch/include/THC -isystem /home/mli/.conda/envs/banmo-cu113/include -isystem /home/mli/.conda/envs/banmo-cu113/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 -std=c++14 -c /home/mli/.conda/envs/banmo-cu113/lib/python3.9/site-packages/nerfacc/cuda/csrc/scan.cu -o scan.cuda.o
ninja: build stopped: subcommand failed.

Env info: CUDA 11.3, torch 1.10, nerfacc 0.5.3

liruilong940607 commented 4 months ago

I see this error: fatal error: ATen/cuda/CUDAGeneratorImpl.h: No such file or directory. For me this file lives in:

/home/ruilongli/anaconda3/envs/nerfacc/lib/python3.9/site-packages/torch/include/ATen/cuda/CUDAGeneratorImpl.h

You might want to check whether torch is installed correctly in this conda env.

Alternatively, you could install our prebuilt wheels from here.
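
For reference, a minimal sketch of the check being suggested above (not from the thread; the paths are just how a conda-installed torch usually lays out its headers). It only verifies that the header the JIT build is asking for ships with the torch in the active environment:

import os
import torch

# Locate the include directory of the torch that is actually installed
# in this environment, then look for the header the build failed on.
include_dir = os.path.join(os.path.dirname(torch.__file__), "include")
header = os.path.join(include_dir, "ATen", "cuda", "CUDAGeneratorImpl.h")
print("torch:", torch.__version__, "cuda:", torch.version.cuda)
print("header found:", os.path.exists(header))

If the header is missing, reinstalling torch (or using the prebuilt nerfacc wheels) is the quickest way out.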

limaolin2017 commented 4 months ago

Hi, does the latest version of nerfacc support torch 1.10?

liruilong940607 commented 4 months ago

Yes

limaolin2017 commented 4 months ago

I switched to another conda env.

Env info: CUDA 11.3, torch 1.11

I encountered some errors:

Traceback (most recent call last):
  File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 201, in <module>
    app.run(main)
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/gpfs/home/mli/banmo/scripts/visualize/nvs.py", line 140, in main
    rendered_chunks = render_rays(nerf_models,
  File "/gpfs/home/mli/banmo/nnutils/rendering.py", line 136, in render_rays
    ray_indices, t_starts, t_ends = estimator.sampling(
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/nerfacc/estimators/occgrid.py", line 164, in sampling
    intervals, samples, _ = traverse_grids(
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/nerfacc/grid.py", line 165, in traverse_grids
    intervals, samples, termination_planes = _C.traverse_grids(
  File "/home/mli/.conda/envs/torch-110/lib/python3.9/site-packages/nerfacc/cuda/__init__.py", line 13, in call_cuda
    return getattr(_C, name)(*args, **kwargs)
RuntimeError: CUDA error: an illegal memory access was encountered
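
Editorial note, not part of the original report: CUDA errors are reported asynchronously, so the Python frame in a traceback like this may not be where the bad access actually happens. A small sketch of how to force synchronous launches so the error is attributed to the call that triggered it (the environment variable must be set before anything touches CUDA):

import os

# Debugging aid only, not a fix: makes every kernel launch blocking so the
# illegal memory access is raised at the offending call, at some speed cost.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"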

liruilong940607 commented 4 months ago

Could you dump the inputs of this function here and share them so I can take a look?
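
A minimal sketch of one way to do that dump, assuming it is placed next to the failing call in rendering.py; the helper name and file name are placeholders, not part of nerfacc:

import torch

def dump_inputs(path="inputs.pth", **tensors):
    # Detach and move tensors to CPU so the saved file can be loaded on any machine,
    # then call this with the same keyword arguments passed to traverse_grids.
    payload = {
        k: (v.detach().cpu() if isinstance(v, torch.Tensor) else v)
        for k, v in tensors.items()
    }
    torch.save(payload, path)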

limaolin2017 commented 4 months ago

traverse_grids inputs: {'rays_o': tensor([[-0.2437, -0.0069, 0.0255], [-0.2437, -0.0069, 0.0255], [-0.2437, -0.0069, 0.0255], ..., [-0.2437, -0.0069, 0.0255], [-0.2437, -0.0069, 0.0255], [-0.2437, -0.0069, 0.0255]], device='cuda:0'), 'rays_d': tensor([[ 0.9839, 0.1560, -0.2869], [ 0.9840, 0.1560, -0.2848], [ 0.9841, 0.1559, -0.2827], ..., [ 0.9838, 0.1237, -0.3298], [ 0.9839, 0.1236, -0.3277], [ 0.9840, 0.1235, -0.3256]], device='cuda:0'), 'binaries': tensor([[[[False, False, False, ..., False, False, False], [False, False, False, ..., False, False, False], [False, False, False, ..., False, False, False], ..., [False, False, False, ..., False, False, False], [False, False, False, ..., False, False, False], [False, False, False, ..., False, False, False]],

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     ...,

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]],

     [[False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      ...,
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False],
      [False, False, False,  ..., False, False, False]]]]), 'aabbs': tensor([[0.0000, 0.0000, 0.0000, 0.3000, 0.3000, 0.3000]], device='cuda:0'), 'near_planes': tensor([0.2000, 0.2000, 0.2000,  ..., 0.2000, 0.2000, 0.2000], device='cuda:0'), 'far_planes': tensor([1., 1., 1.,  ..., 1., 1., 1.], device='cuda:0'), 'step_size': 0.001, 'cone_angle': 0.0}
liruilong940607 commented 4 months ago

I can't use these pasted outputs to examine the code. Could you save them to a file (say, .pth or .npz) and upload it here?

limaolin2017 commented 4 months ago

Thank you for checking!

inputs.pth.zip

limaolin2017 commented 4 months ago

I have checked the input arguments for NaN values and verified their shapes and types, and found no issues. Can you suggest how to handle this?

limaolin2017 commented 4 months ago

I have also checked the GPU memory usage; it is normal.
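
One more check worth running on the saved inputs (a hedged editorial suggestion, not from the thread): print each tensor's device. A CPU tensor slipping into a custom CUDA kernel is a common cause of this exact error, and in the pasted dump above the 'binaries' tensor prints without device='cuda:0', which suggests it is still on the CPU.

import torch

# Inspect the dumped inputs (file name taken from the comment above) and
# confirm that every tensor lives on the same CUDA device.
inputs = torch.load("inputs.pth")
for name, value in inputs.items():
    if isinstance(value, torch.Tensor):
        print(f"{name}: shape={tuple(value.shape)} dtype={value.dtype} device={value.device}")
    else:
        print(f"{name}: {value}")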

liruilong940607 commented 4 months ago

Hi, I will check this issue after the ECCV deadline tomorrow!

ZitongLan commented 2 months ago

I'm having the same issue with the sampling function; it returns:

  File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/nerfacc/estimators/occgrid.py", line 164, in sampling
    intervals, samples, _ = traverse_grids(
  File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/nerfacc/grid.py", line 165, in traverse_grids
    intervals, samples, termination_planes = _C.traverse_grids(
  File "/home/ztlan/anaconda3/envs/nerf2/lib/python3.8/site-packages/nerfacc/cuda/__init__.py", line 13, in call_cuda
    return getattr(_C, name)(*args, **kwargs)
RuntimeError: CUDA error: an illegal memory access was encountered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Is there any update on the issue? Thanks

ZitongLan commented 2 months ago

I have solved the issue. I hadn't moved my estimator to the GPU, which is why the illegal memory access occurred.
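
For later readers, a minimal sketch of that fix, assuming an OccGridEstimator with the AABB from the dump above; the resolution value is an assumption. The essential part is the .to(device) call, so the estimator's occupancy grid lives on the same CUDA device as the rays:

import torch
from nerfacc import OccGridEstimator

device = torch.device("cuda:0")
estimator = OccGridEstimator(
    roi_aabb=[0.0, 0.0, 0.0, 0.3, 0.3, 0.3],  # aabb value taken from the dump above
    resolution=128,  # assumption; use whatever the model actually needs
).to(device)  # without this, the grid binaries stay on the CPU while rays_o/rays_d are on the GPU
# Later, sampling is called with rays already on the same device, e.g.:
# ray_indices, t_starts, t_ends = estimator.sampling(rays_o, rays_d, sigma_fn=..., render_step_size=1e-3)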