autonomousvision / monosdf

[NeurIPS'22] MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction
MIT License

Error building extension '_hash_encoder' #19

Closed: Liuyveg closed this issue 2 years ago

Liuyveg commented 2 years ago

Thanks for your wonderful work. When I run the command to train MonoSDF, the following error is reported. [image] How can I solve it?

niujinshuchong commented 2 years ago

Hi, did you follow the README to install cudatoolkit-dev? Can you share the full log of the error?

Liuyveg commented 2 years ago

> Hi, did you follow the README to install cudatoolkit-dev? Can you share the full log of the error?

[image]

niujinshuchong commented 2 years ago

Hi, maybe you can comment out this line: https://github.com/autonomousvision/monosdf/blob/main/code/hashencoder/src/hashencoder.cu#L25 and try again.
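For readers who prefer to script the suggested edit, here is a minimal Python sketch that comments out a single line of a source file and keeps a backup. The helper name `comment_out_line` and the `.bak` suffix are illustrative assumptions, not part of MonoSDF. Check that line 25 in your checkout is still the line the link points at before running it.

```python
from pathlib import Path


def comment_out_line(path: str, line_number: int, marker: str = "// ") -> str:
    """Prefix one 1-indexed line with a comment marker; return the line as it was.

    Hypothetical helper: a backup of the unmodified file is written to path + ".bak".
    """
    source = Path(path)
    lines = source.read_text().splitlines(keepends=True)
    if not 1 <= line_number <= len(lines):
        raise ValueError(f"{path} has no line {line_number}")
    original = lines[line_number - 1]
    if original.lstrip().startswith(marker.strip()):
        return original  # already commented out; leave the file untouched
    Path(path + ".bak").write_text("".join(lines))  # keep the original for easy undo
    lines[line_number - 1] = marker + original
    source.write_text("".join(lines))
    return original


# Example call (path relative to the repository's code/ directory, as in the link above):
# comment_out_line("hashencoder/src/hashencoder.cu", 25)
```

Editing the file by hand works just as well; the script only makes the change repeatable and easy to undo.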

Liuyveg commented 2 years ago

Hi, I hit the same error after following the README to reinstall the environment and trying again. The first time I ran the command, it reported "Ninja is required to load C++ extensions". [image] So I installed Ninja with: pip install ninja. Oddly, when I tried again, the original error came back.
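A quick way to confirm whether the ninja executable is visible in the active environment before launching training is sketched below. It uses only the Python standard library, and the function name `find_ninja` is an illustrative assumption. Finding ninja on PATH only rules out the "Ninja is required" message; it says nothing about the later compile error.

```python
import shutil
from typing import Optional


def find_ninja() -> Optional[str]:
    """Return the full path of the ninja executable on PATH, or None if it is absent."""
    return shutil.which("ninja")


if __name__ == "__main__":
    path = find_ninja()
    if path is None:
        print("ninja not found on PATH; try `pip install ninja` in the active environment")
    else:
        print(f"ninja found at {path}")
```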

niujinshuchong commented 2 years ago

Hi, did you solve the problem or could you share the full error log after installing ninja?

Liuyveg commented 2 years ago

Hi, I am still stuck on this problem. Here is the full error log. Thanks for your kind help.

CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node 1 --nnodes=1 --node_rank=0 training/exp_runner.py --conf confs/scannet_mlp.conf --scan_id 1
/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated and will be removed in future. Use torchrun. Note that --use_env is set by default in torchrun. If your script expects --local_rank argument to be set, please change it to read from os.environ['LOCAL_RANK'] instead. See https://pytorch.org/docs/stable/distributed.html#launch-utility for further instructions

warnings.warn(
RANK and WORLD_SIZE in environ: 0/1
0
shell command : training/exp_runner.py --local_rank=0 --conf confs/scannet_mlp.conf --scan_id 1
Loading data ...
Finish loading data. Data-set size: 465
RUNNING FOR 430
Detected CUDA files, patching ldflags
Emitting ninja build file ./tmp_build/build.ninja...
Building extension module _hash_encoder...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/3] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=_hash_encoder -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/include -isystem /home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/include/TH -isystem /home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/ailab/anaconda3/envs/ly_monosdf/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 -std=c++14 -allow-unsupported-compiler -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -c /home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu -o hashencoder.cuda.o
FAILED: hashencoder.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=_hash_encoder -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/include -isystem /home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/include/TH -isystem /home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/ailab/anaconda3/envs/ly_monosdf/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_61,code=compute_61 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 -std=c++14 -allow-unsupported-compiler -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -c /home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu -o hashencoder.cuda.o
/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(25): error: no instance of overloaded function "atomicAdd" matches the argument list
            argument types are: (half *, c10::Half)

/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced detected during: instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t , const scalar_t , const scalar_t , const int , const scalar_t , const scalar_t , scalar_t , uint32_t, uint32_t, float, uint32_t) [with scalar_t=double, D=2U, C=2U, N_C=2U]" (687): here instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t ) [with scalar_t=double, D=2U]" (721): here instantiation of "void hash_encode_second_backward_cuda(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t *) [with scalar_t=double]" (817): here

/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced detected during: instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t , const scalar_t , const scalar_t , const int , const scalar_t , const scalar_t , scalar_t , uint32_t, uint32_t, float, uint32_t) [with scalar_t=double, D=2U, C=4U, N_C=2U]" (692): here instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t ) [with scalar_t=double, D=2U]" (721): here instantiation of "void hash_encode_second_backward_cuda(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t *) [with scalar_t=double]" (817): here

/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced detected during: instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t , const scalar_t , const scalar_t , const int , const scalar_t , const scalar_t , scalar_t , uint32_t, uint32_t, float, uint32_t) [with scalar_t=double, D=2U, C=8U, N_C=2U]" (698): here instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t ) [with scalar_t=double, D=2U]" (721): here instantiation of "void hash_encode_second_backward_cuda(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t *) [with scalar_t=double]" (817): here

/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced detected during: instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t , const scalar_t , const scalar_t , const int , const scalar_t , const scalar_t , scalar_t , uint32_t, uint32_t, float, uint32_t) [with scalar_t=double, D=3U, C=2U, N_C=2U]" (687): here instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t ) [with scalar_t=double, D=3U]" (722): here instantiation of "void hash_encode_second_backward_cuda(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t *) [with scalar_t=double]" (817): here

/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced detected during: instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t , const scalar_t , const scalar_t , const int , const scalar_t , const scalar_t , scalar_t , uint32_t, uint32_t, float, uint32_t) [with scalar_t=double, D=3U, C=4U, N_C=2U]" (692): here instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t ) [with scalar_t=double, D=3U]" (722): here instantiation of "void hash_encode_second_backward_cuda(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t *) [with scalar_t=double]" (817): here

/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced detected during: instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t , const scalar_t , const scalar_t , const int , const scalar_t , const scalar_t , scalar_t , uint32_t, uint32_t, float, uint32_t) [with scalar_t=double, D=3U, C=8U, N_C=2U]" (698): here instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t ) [with scalar_t=double, D=3U]" (722): here instantiation of "void hash_encode_second_backward_cuda(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t *) [with scalar_t=double]" (817): here

/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced detected during: instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t , const scalar_t , const scalar_t , const int , const scalar_t , const scalar_t , scalar_t , uint32_t, uint32_t, float, uint32_t) [with scalar_t=float, D=2U, C=2U, N_C=2U]" (687): here instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t ) [with scalar_t=float, D=2U]" (721): here instantiation of "void hash_encode_second_backward_cuda(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t *) [with scalar_t=float]" (817): here

/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced detected during: instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t , const scalar_t , const scalar_t , const int , const scalar_t , const scalar_t , scalar_t , uint32_t, uint32_t, float, uint32_t) [with scalar_t=float, D=2U, C=4U, N_C=2U]" (692): here instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t ) [with scalar_t=float, D=2U]" (721): here instantiation of "void hash_encode_second_backward_cuda(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t *) [with scalar_t=float]" (817): here

/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced detected during: instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t , const scalar_t , const scalar_t , const int , const scalar_t , const scalar_t , scalar_t , uint32_t, uint32_t, float, uint32_t) [with scalar_t=float, D=2U, C=8U, N_C=2U]" (698): here instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t ) [with scalar_t=float, D=2U]" (721): here instantiation of "void hash_encode_second_backward_cuda(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t *) [with scalar_t=float]" (817): here

/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced detected during: instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t , const scalar_t , const scalar_t , const int , const scalar_t , const scalar_t , scalar_t , uint32_t, uint32_t, float, uint32_t) [with scalar_t=float, D=3U, C=2U, N_C=2U]" (687): here instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t ) [with scalar_t=float, D=3U]" (722): here instantiation of "void hash_encode_second_backward_cuda(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t *) [with scalar_t=float]" (817): here

/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced detected during: instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t , const scalar_t , const scalar_t , const int , const scalar_t , const scalar_t , scalar_t , uint32_t, uint32_t, float, uint32_t) [with scalar_t=float, D=3U, C=4U, N_C=2U]" (692): here instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t ) [with scalar_t=float, D=3U]" (722): here instantiation of "void hash_encode_second_backward_cuda(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t *) [with scalar_t=float]" (817): here

/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu(513): warning: variable "results_grad" was declared but never referenced detected during: instantiation of "void kernel_grid_second_backward_embedding<scalar_t,D,C,N_C>(const scalar_t , const scalar_t , const scalar_t , const int , const scalar_t , const scalar_t , scalar_t , uint32_t, uint32_t, float, uint32_t) [with scalar_t=float, D=3U, C=8U, N_C=2U]" (698): here instantiation of "void kernel_grid_second_backward_wrapper<scalar_t,D>(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t ) [with scalar_t=float, D=3U]" (722): here instantiation of "void hash_encode_second_backward_cuda(const scalar_t , const scalar_t , const scalar_t , const int , uint32_t, uint32_t, uint32_t, uint32_t, float, uint32_t, __nv_bool, const scalar_t , const scalar_t , scalar_t , scalar_t *) [with scalar_t=float]" (817): here

1 error detected in the compilation of "/home/ailab/ailab/LY/monosdf/code/hashencoder/src/hashencoder.cu".
[2/3] c++ -MMD -MF bindings.o.d -DTORCH_EXTENSION_NAME=_hash_encoder -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/include -isystem /home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/include/TH -isystem /home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /home/ailab/anaconda3/envs/ly_monosdf/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -c /home/ailab/ailab/LY/monosdf/code/hashencoder/src/bindings.cpp -o bindings.o
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1808, in _run_ninja_build
    subprocess.run(
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "training/exp_runner.py", line 58, in <module>
    trainrunner = MonoSDFTrainRunner(conf=opt.conf,
  File "/home/ailab/ailab/LY/monosdf/code/../code/training/monosdf_train.py", line 107, in __init__
    self.model = utils.get_class(self.conf.get_string('train.model_class'))(conf=conf_model)
  File "/home/ailab/ailab/LY/monosdf/code/../code/utils/general.py", line 17, in get_class
    m = __import__(module)
  File "/home/ailab/ailab/LY/monosdf/code/../code/model/network.py", line 140, in <module>
    from hashencoder.hashgrid import _hash_encode, HashEncoder
  File "/home/ailab/ailab/LY/monosdf/code/../code/hashencoder/__init__.py", line 1, in <module>
    from .hashgrid import HashEncoder
  File "/home/ailab/ailab/LY/monosdf/code/../code/hashencoder/hashgrid.py", line 12, in <module>
    from .backend import _backend
  File "/home/ailab/ailab/LY/monosdf/code/../code/hashencoder/backend.py", line 10, in <module>
    _backend = load(name='_hash_encoder',
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1202, in load
    return _jit_compile(
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1425, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1537, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension '_hash_encoder'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 8802) of binary: /home/ailab/anaconda3/envs/ly_monosdf/bin/python
Traceback (most recent call last):
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/distributed/run.py", line 752, in run
    elastic_launch(
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/ailab/anaconda3/envs/ly_monosdf/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

training/exp_runner.py FAILED

Failures:

------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time : 2022-10-11_17:20:58
  host : ailab-All-Series
  rank : 0 (local_rank: 0)
  exitcode : 1 (pid: 8802)
  error_file:
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

niujinshuchong commented 2 years ago

Hi, did you try commenting out this line: https://github.com/autonomousvision/monosdf/blob/main/code/hashencoder/src/hashencoder.cu#L25 ?

Liuyveg commented 2 years ago

> Hi, did you try commenting out this line: https://github.com/autonomousvision/monosdf/blob/main/code/hashencoder/src/hashencoder.cu#L25 ?

Yeah, it finally works. Thanks for your reminder.
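A likely reason the build failed on this machine: the nvcc command in the log targets compute_61, and CUDA provides `atomicAdd` on `__half` only for compute capability 7.x and higher. The thread itself does not state this, so treat it as an assumption. The Python sketch below pulls the target architectures out of an nvcc command line and lists those below that threshold; the function names and the constant are illustrative, not part of MonoSDF or PyTorch.

```python
import re
from typing import List

# Assumption: native atomicAdd on __half needs compute capability 7.0 or newer.
MIN_HALF_ATOMIC_ADD_ARCH = 70


def gencode_archs(nvcc_command: str) -> List[int]:
    """Collect the distinct compute capabilities named in -gencode flags."""
    found = re.findall(r"arch=compute_(\d+)", nvcc_command)
    return sorted({int(arch) for arch in found})


def archs_without_half_atomic_add(nvcc_command: str) -> List[int]:
    """Return the target architectures that fall below the assumed threshold."""
    return [a for a in gencode_archs(nvcc_command) if a < MIN_HALF_ATOMIC_ADD_ARCH]


if __name__ == "__main__":
    # Flags copied from the failing command in the log above (shortened).
    command = (
        "nvcc -gencode=arch=compute_61,code=compute_61 "
        "-gencode=arch=compute_61,code=sm_61 -c hashencoder.cu"
    )
    print(archs_without_half_atomic_add(command))  # prints [61]
```

If the list is non-empty, commenting out the half-precision `atomicAdd` line, as suggested above, is the workaround that resolved this issue.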