Code gets stuck when updating occupancy grid

volcverse commented 2 years ago

Hi there, thanks for sharing your work!

When I run train_mlp_nerf.py, the code always gets stuck at the occupancy grid update, and no errors or warnings are printed.

Could you please help me with this problem? Any response would be greatly appreciated.

ENV: torch 1.12.1, cudatoolkit 11.3, gcc 10.3.0

volcverse commented 2 years ago

And when I interrupt the training, it always stops here:

Traceback (most recent call last):
  File "/home/zcy/Code/nerfacc/examples/train_mlp_nerf.py", line 170, in <module>
    occupancy_grid.every_n_step(
  File "/home/zcy/anaconda3/envs/neuris/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/zcy/Code/nerfacc/nerfacc/grid.py", line 271, in every_n_step
    self._update(
  File "/home/zcy/anaconda3/envs/neuris/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/zcy/Code/nerfacc/nerfacc/grid.py", line 224, in _update
    x = contract_inv(
  File "/home/zcy/anaconda3/envs/neuris/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/zcy/Code/nerfacc/nerfacc/contraction.py", line 101, in contract_inv
    ctype = type.to_cpp_version()
  File "/home/zcy/Code/nerfacc/nerfacc/contraction.py", line 62, in to_cpp_version
    return _C.ContractionTypeGetter(self.value)
  File "/home/zcy/Code/nerfacc/nerfacc/cuda/__init__.py", line 11, in call_cuda
    from ._backend import _C
  File "/home/zcy/Code/nerfacc/nerfacc/cuda/_backend.py", line 38, in <module>
    _C = load_extention(name)
  File "/home/zcy/Code/nerfacc/nerfacc/cuda/_backend.py", line 25, in load_extention
    return load(
  File "/home/zcy/anaconda3/envs/neuris/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/home/zcy/anaconda3/envs/neuris/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1439, in _jit_compile
    baton.wait()
  File "/home/zcy/anaconda3/envs/neuris/lib/python3.8/site-packages/torch/utils/file_baton.py", line 42, in wait
    time.sleep(self.wait_seconds)
KeyboardInterrupt

liruilong940607 commented 2 years ago

The first time you run the code, it takes some time to build the CUDA part. I think you just need to wait 1-2 minutes. It is a one-time cost.

zheruiqiu commented 2 years ago

Same problem here. I saw "Setting up CUDA (This may take a few minutes the first time)", but I've been waiting for it to finish for ages...

Is there any other way to install nerfacc?

Environment:

cudatoolkit 11.3.1
pytorch 1.12.0
GCC 9.4.0

The code is as follows:

import nerfacc
import torch

aabb = torch.tensor([0.0, 0.0, 0.0, 1.0, 1.0, 1.0], device="cuda:0")
rays_o = torch.rand((128, 3), device="cuda:0")
rays_d = torch.randn((128, 3), device="cuda:0")
rays_d = rays_d / rays_d.norm(dim=-1, keepdim=True)
t_min, t_max = nerfacc.ray_aabb_intersect(rays_o, rays_d, aabb)
print(t_min)

Traceback (most recent call last):
  File "test.py", line 8, in <module>
    t_min, t_max = nerfacc.ray_aabb_intersect(rays_o, rays_d, aabb)
  File "/***/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/***/lib/python3.8/site-packages/nerfacc/intersection.py", line 47, in ray_aabb_intersect
    t_min, t_max = _C.ray_aabb_intersect(rays_o, rays_d, aabb)
  File "/***/lib/python3.8/site-packages/nerfacc/cuda/__init__.py", line 11, in call_cuda
    from ._backend import _C
  File "/***/lib/python3.8/site-packages/nerfacc/cuda/_backend.py", line 31, in <module>
    _C = load(
  File "/***/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/***/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1439, in _jit_compile
    baton.wait()
  File "/***/lib/python3.8/site-packages/torch/utils/file_baton.py", line 42, in wait
    time.sleep(self.wait_seconds)
KeyboardInterrupt

volcverse commented 2 years ago

Actually, I don't even see the "Setting up CUDA" message. I have been waiting for hours, but nothing happens.

liruilong940607 commented 2 years ago

Could you try this solution? https://github.com/zhou13/neurvps/issues/1#issuecomment-820898095

The explanation is that PyTorch creates a lock file while building. If the Python process is killed for some reason, the lock file stays there forever and PyTorch won't clean it up.
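For anyone who wants to script the cleanup, here is a minimal sketch rather than an official tool. It assumes the JIT build cache lives in the directory named by the TORCH_EXTENSIONS_DIR environment variable, or in ~/.cache/torch_extensions if that is unset (the usual Linux default), and that the leftover file is called "lock", as the file_baton.py frame in the tracebacks above suggests. The helper names are made up for illustration. Only delete the locks when no other build is actually running, since a live build legitimately holds its lock.

```python
import os
from pathlib import Path


def default_extensions_root() -> Path:
    # Assumption: TORCH_EXTENSIONS_DIR overrides the cache location; otherwise
    # the cache is typically ~/.cache/torch_extensions on Linux.
    env = os.environ.get("TORCH_EXTENSIONS_DIR")
    return Path(env) if env else Path.home() / ".cache" / "torch_extensions"


def find_stale_locks(root: Path) -> list:
    # The JIT build keeps waiting as long as a file named "lock" exists
    # in the extension's build directory, so look for those files.
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("lock") if p.is_file())


def remove_stale_locks(root: Path, dry_run: bool = True) -> list:
    # With dry_run=True this only reports what would be deleted.
    locks = find_stale_locks(root)
    for lock in locks:
        print(("would remove " if dry_run else "removing ") + str(lock))
        if not dry_run:
            lock.unlink()
    return locks


if __name__ == "__main__":
    # Inspect first; switch to dry_run=False once the list looks right.
    remove_stale_locks(default_extensions_root(), dry_run=True)
```

After the lock is gone, the next run should rebuild the extension instead of waiting on the lock indefinitely.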

zheruiqiu commented 2 years ago

Thanks a lot, @liruilong940607! This works for me.

liruilong940607 commented 2 years ago

Closing for now, as the problem seems to be solved.

volcverse commented 2 years ago

Thanks. It works for me, too. I was so excited that I forgot to close the issue... sorry.

nerfstudio-project / nerfacc

Code gets stuck when updating occupancy grid #70