bennyguo / instant-nsr-pl

Neural surface reconstruction based on Instant-NGP. An efficient, customizable boilerplate for your research projects. Train NeuS in 10 minutes!
MIT License

Import Error when running training code #75

Closed gnwekge78707 closed 1 year ago

gnwekge78707 commented 1 year ago

Environment: Ubuntu, torch 1.12.0+cu113

command: python launch.py --config configs/neus-dtu.yaml --gpu 0 --train

output:

/root/miniconda3/envs/daformer/lib/python3.8/site-packages/tinycudann/modules.py:52: UserWarning: tinycudann was built for lower compute capability (80) than the system's (86). Performance may be suboptimal.
  warnings.warn(f"tinycudann was built for lower compute capability ({cc}) than the system's ({system_compute_capability}). Performance may be suboptimal.")
Global seed set to 42
Using 16bit None Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used.
[rank: 0] Global seed set to 42
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 1 processes
----------------------------------------------------------------------------------------------------

You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
fatal: not a git repository (or any of the parent directories): .git
/root/nerf/instant-nsr-pl-main/utils/callbacks.py:76: UserWarning: Code snapshot is not saved. Please make sure you have git installed and are in a git repository.
  rank_zero_warn("Code snapshot is not saved. Please make sure you have git installed and are in a git repository.")

  | Name  | Type      | Params
------------------------------------
0 | model | NeuSModel | 25.2 M
------------------------------------
25.2 M    Trainable params
0         Non-trainable params
25.2 M    Total params
50.441    Total estimated model params size (MB)
Traceback (most recent call last):
  File "launch.py", line 125, in <module>
    main()
  File "launch.py", line 114, in main
    trainer.fit(system, datamodule=dm)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 608, in fit
    call._call_and_handle_interrupt(
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 36, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 88, in launch
    return function(*args, **kwargs)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 650, in _fit_impl
    self._run(model, ckpt_path=self.ckpt_path)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1112, in _run
    results = self._run_stage()
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1191, in _run_stage
    self._run_train()
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1214, in _run_train
    self.fit_loop.run()
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 267, in advance
    self._outputs = self.epoch_loop.run(self._data_fetcher)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
    self.advance(*args, **kwargs)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 204, in advance
    response = self.trainer._call_lightning_module_hook("on_train_batch_start", batch, batch_idx)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1356, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/root/nerf/instant-nsr-pl-main/systems/base.py", line 57, in on_train_batch_start
    update_module_step(self.model, self.current_epoch, self.global_step)
  File "/root/nerf/instant-nsr-pl-main/systems/utils.py", line 351, in update_module_step
    m.update_step(epoch, global_step)
  File "/root/nerf/instant-nsr-pl-main/models/neus.py", line 109, in update_step
    self.occupancy_grid.every_n_step(step=global_step, occ_eval_fn=occ_eval_fn, occ_thre=self.config.get('grid_prune_occ_thre', 0.01))
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/nerfacc/grid.py", line 271, in every_n_step
    self._update(
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/nerfacc/grid.py", line 224, in _update
    x = contract_inv(
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/nerfacc/contraction.py", line 101, in contract_inv
    ctype = type.to_cpp_version()
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/nerfacc/contraction.py", line 62, in to_cpp_version
    return _C.ContractionTypeGetter(self.value)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/nerfacc/cuda/__init__.py", line 11, in call_cuda
    from ._backend import _C
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/nerfacc/cuda/_backend.py", line 85, in <module>
    _C = load(
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1450, in _jit_compile
    return _import_module_from_library(name, build_directory, is_python_module)
  File "/root/miniconda3/envs/daformer/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1844, in _import_module_from_library
    module = importlib.util.module_from_spec(spec)
ImportError: /root/.cache/torch_extensions/py38_cu113/nerfacc_cuda/nerfacc_cuda.so: cannot open shared object file: No such file or directory
Epoch 0: : 0it [00:06, ?it/s]
bennyguo commented 1 year ago

It seems that nerfacc was not compiled correctly. I suggest you install it from source via

pip install git+https://github.com/KAIR-BAIR/nerfacc.git@0.3.3
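This kind of ImportError typically means a stale or failed just-in-time build is cached under `~/.cache/torch_extensions`, so the loader looks for a `.so` that was never produced. A minimal recovery sketch, assuming the `py38_cu113` cache path seen in the traceback above (adjust the path and environment to your setup):

```shell
# 1. Remove the stale JIT build so the extension is rebuilt from scratch.
rm -rf ~/.cache/torch_extensions/py38_cu113/nerfacc_cuda

# 2. Reinstall nerfacc pinned to the release this repo expects;
#    building from source compiles the CUDA extension locally.
pip install git+https://github.com/KAIR-BAIR/nerfacc.git@0.3.3

# 3. Importing nerfacc exercises the fresh build; a version string
#    printed here means the extension loaded successfully.
python -c "import nerfacc; print(nerfacc.__version__)"
```

If the rebuild still fails, the compiler output during step 2 usually shows the real cause (for example a CUDA toolkit / driver mismatch), which the silent JIT path hides.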