Closed (antithing closed this issue 1 year ago)
Uninstalling nerfacc and running pip install nerfacc==0.3.3 solved that, but now I get a different error:
D:\NERF\NEUS\instant-nsr-pl>python launch.py --config configs/neus-dtu.yaml --gpu 0 --train
Global seed set to 42
Using 16bit None Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used..
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
------------------------------------
0 | model | NeuSModel | 28.0 M
------------------------------------
28.0 M Trainable params
0 Non-trainable params
28.0 M Total params
55.913 Total estimated model params size (MB)
( ● ) NerfAcc: Setting up CUDA (This may take a few minutes the first time)
C:/Users/B/AppData/Local/Programs/Python/Python39/lib/site-packages/torch/include\ATen/core/dispatch/OperatorEntry.h(270): note: see reference to class template instantiation 'ska::flat_hash_map<c10::DispatchKey,std::list<c10::impl::AnnotatedKernel,std::allocator<c10::impl::AnnotatedKernel>>,std::hash<c10::DispatchKey>,std::equal_to<K>,std::allocator<std::pair<K,V>>>' being compiled
with
[
K=c10::DispatchKey,
V=std::list<c10::impl::AnnotatedKernel,std::allocator<c10::impl::AnnotatedKernel>>
]
[10/10] "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.16.27023\bin\HostX64\x64/link.exe" cdf.cuda.o contraction.cuda.o intersection.cuda.o pack.cuda.o pybind.cuda.o ray_marching.cuda.o render_transmittance.cuda.o render_transmittance_cub.cuda.o render_weight.cuda.o /nologo /DLL c10.lib c10_cuda.lib torch_cpu.lib torch_cuda.lib -INCLUDE:?warp_size@cuda@at@@YAHXZ torch.lib /LIBPATH:C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\lib torch_python.lib /LIBPATH:C:\Users\B\AppData\Local\Programs\Python\Python39\libs "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\lib\x64" cudart.lib /out:nerfacc_cuda.pyd
FAILED: nerfacc_cuda.pyd
"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.16.27023\bin\HostX64\x64/link.exe" cdf.cuda.o contraction.cuda.o intersection.cuda.o pack.cuda.o pybind.cuda.o ray_marching.cuda.o render_transmittance.cuda.o render_transmittance_cub.cuda.o render_weight.cuda.o /nologo /DLL c10.lib c10_cuda.lib torch_cpu.lib torch_cuda.lib -INCLUDE:?warp_size@cuda@at@@YAHXZ torch.lib /LIBPATH:C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\lib torch_python.lib /LIBPATH:C:\Users\B\AppData\Local\Programs\Python\Python39\libs "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\lib\x64" cudart.lib /out:nerfacc_cuda.pyd
Creating library nerfacc_cuda.lib and object nerfacc_cuda.exp
MSVCRT.lib(loadcfg.obj) : error LNK2001: unresolved external symbol __guard_eh_cont_table
MSVCRT.lib(loadcfg.obj) : error LNK2001: unresolved external symbol __guard_eh_cont_count
nerfacc_cuda.pyd : fatal error LNK1120: 2 unresolved externals
ninja: build stopped: subcommand failed.
Epoch 0: : 0it [01:33, ?it/s]
[W C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
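The version pin mentioned above can be confirmed from Python before launching training. This is a minimal sketch, not part of instant-nsr-pl or nerfacc: the helper name installed_version is hypothetical, and it relies only on the standard library's importlib.metadata.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    expected = "0.3.3"  # the pin used in this thread
    found = installed_version("nerfacc")
    if found is None:
        print("nerfacc is not installed in this environment")
    elif found != expected:
        print(f"nerfacc {found} is installed; this thread pinned {expected}")
    else:
        print(f"nerfacc {found} matches the pin")
```

Running this in the same interpreter that runs launch.py shows whether the pin actually took effect in that environment.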
I also needed to add the location of cl.exe to my PATH to avoid this error:
D:\NERF\NEUS\instant-nsr-pl>python launch.py --config configs/neus-dtu.yaml --gpu 0 --train
Global seed set to 42
Using 16bit None Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used..
You are using a CUDA device ('NVIDIA GeForce RTX 3090') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
------------------------------------
0 | model | NeuSModel | 28.0 M
------------------------------------
28.0 M Trainable params
0 Non-trainable params
28.0 M Total params
55.913 Total estimated model params size (MB)
C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py:359: UserWarning: Error
checking compiler version for cl: [WinError 2] The system cannot find the file specified
warnings.warn(f'Error checking compiler version for {compiler}: {error}')
( ● ) NerfAcc: Setting up CUDA (This may take a few minutes the first time)INFO: Could not find files for the given pattern(s).
Traceback (most recent call last):
File "D:\NERF\NEUS\instant-nsr-pl\launch.py", line 125, in <module>
main()
File "D:\NERF\NEUS\instant-nsr-pl\launch.py", line 114, in main
trainer.fit(system, datamodule=dm)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 608, in fit
call._call_and_handle_interrupt(
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\call.py", line 38, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 650, in _fit_impl
self._run(model, ckpt_path=self.ckpt_path)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1112, in _run
results = self._run_stage()
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1191, in _run_stage
self._run_train()
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1214, in _run_train
self.fit_loop.run()
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\loop.py", line 199, in run
self.advance(*args, **kwargs)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\fit_loop.py", line 267, in advance
self._outputs = self.epoch_loop.run(self._data_fetcher)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\loop.py", line 199, in run
self.advance(*args, **kwargs)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\loops\epoch\training_epoch_loop.py", line 204, in advance
response = self.trainer._call_lightning_module_hook("on_train_batch_start", batch, batch_idx)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 1356, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "D:\NERF\NEUS\instant-nsr-pl\systems\base.py", line 57, in on_train_batch_start
update_module_step(self.model, self.current_epoch, self.global_step)
File "D:\NERF\NEUS\instant-nsr-pl\systems\utils.py", line 351, in update_module_step
m.update_step(epoch, global_step)
File "D:\NERF\NEUS\instant-nsr-pl\models\neus.py", line 109, in update_step
self.occupancy_grid.every_n_step(step=global_step, occ_eval_fn=occ_eval_fn, occ_thre=self.config.get('grid_prune_occ_thre', 0.01))
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\nerfacc\grid.py", line 271, in every_n_step
self._update(
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\nerfacc\grid.py", line 224, in _update
x = contract_inv(
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\nerfacc\contraction.py", line 101, in contract_inv
ctype = type.to_cpp_version()
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\nerfacc\contraction.py", line 62, in to_cpp_version
return _C.ContractionTypeGetter(self.value)
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\nerfacc\cuda\__init__.py", line 11, in call_cuda
from ._backend import _C
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\nerfacc\cuda\_backend.py", line 85, in <module>
_C = load(
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 1284, in load
return _jit_compile(
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 1509, in _jit_compile
_write_ninja_file_and_build_library(
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 1611, in _write_ninja_file_and_build_library
_write_ninja_file_to_build_library(
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 2048, in _write_ninja_file_to_build_library
_write_ninja_file(
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\utils\cpp_extension.py", line 2188, in _write_ninja_file
cl_paths = subprocess.check_output(['where',
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 424, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "C:\Users\B\AppData\Local\Programs\Python\Python39\lib\subprocess.py", line 528, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['where', 'cl']' returned non-zero exit status 1.
Epoch 0: : 0it [00:15, ?it/s]
[W C:\actions-runner\_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\CudaIPCTypes.cpp:15] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
The directory I added to PATH was: C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.16.27023\bin\HostX64\x64
I have solved this by installing Visual Studio 2019 and adding these directories to the PATH:
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\Hostx64\x64
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\Common7\IDE
However, I have another issue, for which I will open a separate thread.
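For anyone hitting the same two failures: the traceback above shows torch.utils.cpp_extension locating the compiler by running where cl, and the link step failed with the 14.16.27023 toolset but worked with 14.29.30133. A rough pre-flight check along those lines might look like the sketch below. The function names find_cl and msvc_toolset_version are hypothetical, and the version is simply parsed out of the directory name, which assumes the standard Visual Studio install layout seen in the paths in this thread.

```python
import re
import shutil
from typing import Optional


def find_cl() -> Optional[str]:
    """Return the full path of cl.exe if it is on PATH, else None.

    Roughly what `where cl` reports on Windows, taking the first match.
    """
    return shutil.which("cl")


def msvc_toolset_version(path: str) -> Optional[str]:
    """Extract the MSVC toolset version (e.g. '14.29.30133') from a toolchain path.

    Assumes the usual ...\\VC\\Tools\\MSVC\\<version>\\bin\\... layout.
    """
    match = re.search(r"[\\/]MSVC[\\/](\d+\.\d+\.\d+)[\\/]", path)
    return match.group(1) if match else None


if __name__ == "__main__":
    cl = find_cl()
    if cl is None:
        print("cl.exe not found on PATH; the nerfacc CUDA build will fail")
    else:
        print(f"cl.exe: {cl} (toolset {msvc_toolset_version(cl)})")
```

If the reported toolset is the older 14.16 one, that matches the configuration that produced the LNK2001 errors earlier in this thread.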
Hi, thank you for making this code available! I am running the example, and I see the following error:
python launch.py --config configs/neus-dtu.yaml --gpu 0 --train
Gives me:
What might be happening here? Is this a version mismatch?
Thanks!