The tiny-cuda-nn wheel build loops forever: it neither fails nor succeeds, and eventually the build times out.
I am following the nerfstudio installation instructions here: https://github.com/nerfstudio-project/nerfstudio?tab=readme-ov-file#dependencies
These coincide with the instructions in this tiny-cuda-nn repo. In fact, when I use a previous Docker image version, dromni/nerfstudio:0.1.16, with older versions of the libraries and CUDA 11.7, it all works fine. The problematic Dockerfile is:
If I remove the installation of tiny-cuda-nn, everything else builds perfectly fine. Otherwise I get this log:
#5 174.6 Collecting git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
#5 174.6 Cloning https://github.com/NVlabs/tiny-cuda-nn/ to /tmp/pip-req-build-_rc_iady
#5 174.6 Running command git clone --filter=blob:none --quiet https://github.com/NVlabs/tiny-cuda-nn/ /tmp/pip-req-build-_rc_iady
#5 176.4 Resolved https://github.com/NVlabs/tiny-cuda-nn/ to commit c91138bcd4c6877c8d5e60e483c0581aafc70cce
#5 176.4 Running command git submodule update --init --recursive -q
#5 183.6 Preparing metadata (setup.py): started
#5 187.7 Preparing metadata (setup.py): finished with status 'done'
#5 187.9 Collecting ninja
#5 188.0 Downloading ninja-1.11.1.1-py2.py3-none-manylinux1_x86_64.manylinux_2_5_x86_64.whl (307 kB)
#5 188.1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 307.2/307.2 KB 2.4 MB/s eta 0:00:00
#5 188.2 Building wheels for collected packages: tinycudann
#5 188.2 Building wheel for tinycudann (setup.py): started
#5 278.6 Building wheel for tinycudann (setup.py): still running...
#5 592.7 Building wheel for tinycudann (setup.py): still running...
#5 777.5 Building wheel for tinycudann (setup.py): still running...
#5 1176.3 Building wheel for tinycudann (setup.py): still running...
#5 1270.0 Building wheel for tinycudann (setup.py): still running...
#5 1651.6 Building wheel for tinycudann (setup.py): still running...
#5 1917.0 Building wheel for tinycudann (setup.py): still running...
#5 2252.7 Building wheel for tinycudann (setup.py): still running...
#5 2339.5 Building wheel for tinycudann (setup.py): still running...
#5 2701.9 Building wheel for tinycudann (setup.py): still running...
#5 2940.4 Building wheel for tinycudann (setup.py): still running...
#5 3287.7 Building wheel for tinycudann (setup.py): still running...
#5 CANCELED
context canceled
ERROR: Job failed: execution took longer than 1h0m0s seconds
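To put a number on how long the wheel step ran before the job was cancelled, the timestamps in the log above can be subtracted. This is only a minimal sketch: it assumes the BuildKit line format shown above (`#<step> <seconds> <message>`), and `WHEEL_RE` and `wheel_build_seconds` are hypothetical names introduced here for illustration.

```python
import re

# Matches lines such as
# "#5 188.2 Building wheel for tinycudann (setup.py): started"
WHEEL_RE = re.compile(
    r"^#\d+\s+(\d+(?:\.\d+)?)\s+Building wheel for (\S+) "
    r"\(setup\.py\): (started|still running)"
)


def wheel_build_seconds(log_text, package="tinycudann"):
    """Seconds between 'started' and the last progress line for `package`.

    Returns None when no 'started' line is found.
    """
    start = None
    last = None
    for line in log_text.splitlines():
        match = WHEEL_RE.match(line.strip())
        if match is None or match.group(2) != package:
            continue
        timestamp = float(match.group(1))
        if match.group(3) == "started":
            start = timestamp
        last = timestamp
    if start is None:
        return None
    return last - start


# Three lines copied from the log above.
sample = """\
#5 188.2 Building wheel for tinycudann (setup.py): started
#5 278.6 Building wheel for tinycudann (setup.py): still running...
#5 3287.7 Building wheel for tinycudann (setup.py): still running...
"""

elapsed = wheel_build_seconds(sample)
print(f"wheel build ran for about {elapsed:.0f} s before the last log line")
```

On the lines above this gives roughly 3100 seconds, so almost the whole one-hour job limit is spent inside the tinycudann wheel step.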
I passed the --verbose flag to pip and got one NumPy error early on (which does not make the job fail). After that, the output loops through some warnings while the wheel tries to build indefinitely:
NumPy:
#6 156.8 Building wheels for collected packages: tinycudann
#6 156.8 Building wheel for tinycudann (setup.py): started
#6 156.8 Running command python setup.py bdist_wheel
#6 157.8
#6 157.8 A module that was compiled using NumPy 1.x cannot be run in
#6 157.8 NumPy 2.1.2 as it may crash. To support both 1.x and 2.x
#6 157.8 versions of NumPy, modules must be compiled with NumPy 2.0.
#6 157.8 Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.
#6 157.8
#6 157.8 If you are a user of the module, the easiest solution will be to
#6 157.8 downgrade to 'numpy<2' or try to upgrade the affected module.
#6 157.8 We expect that some modules will need time to support NumPy 2.
#6 157.8
#6 157.8 Traceback (most recent call last):
#6 157.8 File "<string>", line 2, in <module>
#6 157.8 File "<pip-setuptools-caller>", line 34, in <module>
#6 157.8 File "/tmp/pip-req-build-cm6ig4ie/bindings/torch/setup.py", line 9, in <module>
#6 157.8 import torch
#6 157.8 File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1382, in <module>
#6 157.8 from .functional import * # noqa: F403
#6 157.8 File "/usr/local/lib/python3.10/dist-packages/torch/functional.py", line 7, in <module>
#6 157.8 import torch.nn.functional as F
#6 157.8 File "/usr/local/lib/python3.10/dist-packages/torch/nn/__init__.py", line 1, in <module>
#6 157.8 from .modules import * # noqa: F403
#6 157.8 File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/__init__.py", line 35, in <module>
#6 157.8 from .transformer import TransformerEncoder, TransformerDecoder, \
#6 157.8 File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py", line 20, in <module>
#6 157.8 device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
#6 157.8 /usr/local/lib/python3.10/dist-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at ../torch/csrc/utils/tensor_numpy.cpp:84.)
#6 157.8 device: torch.device = torch.device(torch._C._get_default_device()), # torch.device('cpu'),
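The message above itself suggests downgrading to 'numpy<2'. Here is a minimal sketch of a check that could run in the image before the tiny-cuda-nn install step, assuming (as the warning implies) that the installed PyTorch build was compiled against NumPy 1.x. `numpy_major` and `needs_numpy_pin` are hypothetical helper names, not part of any library.

```python
def numpy_major(version):
    """Return the major component of a version string such as '2.1.2'."""
    return int(version.split(".")[0])


def needs_numpy_pin(version):
    """True for NumPy 2.x or newer, which the warning says modules
    compiled against NumPy 1.x cannot run with."""
    return numpy_major(version) >= 2


if __name__ == "__main__":
    try:
        import numpy
    except ImportError:
        print("NumPy is not installed")
    else:
        if needs_numpy_pin(numpy.__version__):
            print(f"NumPy {numpy.__version__} found; consider: pip install 'numpy<2'")
        else:
            print(f"NumPy {numpy.__version__} is a 1.x release")
```

Whether this warning is related to the slow wheel build is not clear from the log; the check only addresses the NumPy message.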
Warnings loop:
#6 189.3 [6/10] /usr/local/cuda/bin/nvcc -I/tmp/pip-req-build-cm6ig4ie/include -I/tmp/pip-req-build-cm6ig4ie/dependencies -I/tmp/pip-req-build-cm6ig4ie/dependencies/cutlass/include -I/tmp/pip-req-build-cm6ig4ie/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-cm6ig4ie/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /tmp/pip-req-build-cm6ig4ie/src/object.cu -o /tmp/pip-req-build-cm6ig4ie/bindings/torch/src/object.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -std=c++17 --extended-lambda --expt-relaxed-constexpr -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -Xcompiler=-Wno-float-conversion -Xcompiler=-fno-strict-aliasing -gencode=arch=compute_90,code=compute_90 -gencode=arch=compute_90,code=sm_90 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=90 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_90_C -D_GLIBCXX_USE_CXX11_ABI=0
#6 189.3 /tmp/pip-req-build-cm6ig4ie/dependencies/fmt/include/fmt/core.h(288): warning #1675-D: unrecognized GCC pragma
#6 189.3
#6 189.3 /tmp/pip-req-build-cm6ig4ie/dependencies/fmt/include/fmt/core.h(288): warning #1675-D: unrecognized GCC pragma
#6 189.3
#6 241.3 [7/10] c++ -MMD -MF /tmp/pip-req-build-cm6ig4ie/bindings/torch/build/temp.linux-x86_64-3.10/tinycudann/bindings.o.d -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O2 -Wall -g -fstack-protector-strong -Wformat -Werror=format-security -g -fwrapv -O2 -g -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/tmp/pip-req-build-cm6ig4ie/include -I/tmp/pip-req-build-cm6ig4ie/dependencies -I/tmp/pip-req-build-cm6ig4ie/dependencies/cutlass/include -I/tmp/pip-req-build-cm6ig4ie/dependencies/cutlass/tools/util/include -I/tmp/pip-req-build-cm6ig4ie/dependencies/fmt/include -I/usr/local/lib/python3.10/dist-packages/torch/include -I/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -I/usr/local/lib/python3.10/dist-packages/torch/include/TH -I/usr/local/lib/python3.10/dist-packages/torch/include/THC -I/usr/local/cuda/include -I/usr/include/python3.10 -c -c /tmp/pip-req-build-cm6ig4ie/bindings/torch/tinycudann/bindings.cpp -o /tmp/pip-req-build-cm6ig4ie/bindings/torch/build/temp.linux-x86_64-3.10/tinycudann/bindings.o -std=c++17 -DTCNN_PARAMS_UNALIGNED -DTCNN_MIN_GPU_ARCH=90 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_90_C -D_GLIBCXX_USE_CXX11_ABI=0
#6 241.3 In file included from /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/Exceptions.h:14,
#6 241.3 from /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include/torch/python.h:11,
#6 241.3 from /usr/local/lib/python3.10/dist-packages/torch/include/torch/extension.h:9,
#6 241.3 from /tmp/pip-req-build-cm6ig4ie/bindings/torch/tinycudann/bindings.cpp:34:
#6 241.3 /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<tcnn::cpp::LogSeverity>’:
#6 241.3 /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:2170:7: required from ‘class pybind11::enum_<tcnn::cpp::LogSeverity>’
#6 241.3 /tmp/pip-req-build-cm6ig4ie/bindings/torch/tinycudann/bindings.cpp:283:52: required from here
#6 241.3 /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:1496:7: warning: ‘pybind11::class_<tcnn::cpp::LogSeverity>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
#6 241.3 1496 | class class_ : public detail::generic_type {
#6 241.3 | ^~~~~~
#6 241.3 /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<tcnn::cpp::Precision>’:
#6 241.3 /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:2170:7: required from ‘class pybind11::enum_<tcnn::cpp::Precision>’
#6 241.3 /tmp/pip-req-build-cm6ig4ie/bindings/torch/tinycudann/bindings.cpp:292:48: required from here
#6 241.3 /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:1496:7: warning: ‘pybind11::class_<tcnn::cpp::Precision>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
#6 241.3 /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<tcnn::cpp::Context>’:
#6 241.3 /tmp/pip-req-build-cm6ig4ie/bindings/torch/tinycudann/bindings.cpp:309:45: required from here
#6 241.3 /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:1496:7: warning: ‘pybind11::class_<tcnn::cpp::Context>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
#6 241.3 /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h: In instantiation of ‘class pybind11::class_<Module>’:
#6 241.3 /tmp/pip-req-build-cm6ig4ie/bindings/torch/tinycudann/bindings.cpp:316:32: required from here
#6 241.3 /usr/local/lib/python3.10/dist-packages/torch/include/pybind11/pybind11.h:1496:7: warning: ‘pybind11::class_<Module>’ declared with greater visibility than its base ‘pybind11::detail::generic_type’ [-Wattributes]
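One way to tell whether the build is repeating work or just compiling slowly is to follow the ninja step counter (`[6/10]`, `[7/10]`, ...) in the verbose output. This is a minimal sketch under the assumption that every progress line has the `#<step> <seconds> [n/total]` shape seen above; `STEP_RE`, `ninja_steps` and `is_progressing` are hypothetical names, and the sample commands are abbreviated from the real ones.

```python
import re

# Matches BuildKit-prefixed ninja progress lines such as
# "#6 189.3 [6/10] /usr/local/cuda/bin/nvcc ..."
STEP_RE = re.compile(r"^#\d+\s+(\d+(?:\.\d+)?)\s+\[(\d+)/(\d+)\]")


def ninja_steps(log_text):
    """Return a (seconds, step, total) tuple for every ninja progress line."""
    steps = []
    for line in log_text.splitlines():
        match = STEP_RE.match(line)
        if match:
            steps.append(
                (float(match.group(1)), int(match.group(2)), int(match.group(3)))
            )
    return steps


def is_progressing(steps):
    """True when the step counter strictly increases (no step repeats).

    Also True for zero or one step, since nothing repeats in that case.
    """
    numbers = [step for _, step, _ in steps]
    return all(a < b for a, b in zip(numbers, numbers[1:]))


# Abbreviated versions of the two progress lines from the log above,
# with one warning line in between that must be ignored.
sample = """\
#6 189.3 [6/10] /usr/local/cuda/bin/nvcc -c object.cu
#6 189.3 warning #1675-D: unrecognized GCC pragma
#6 241.3 [7/10] c++ -c bindings.cpp
"""

steps = ninja_steps(sample)
for seconds, step, total in steps:
    print(f"{seconds:8.1f}s  step {step}/{total}")
print("progressing" if is_progressing(steps) else "a step repeated")
```

Running the same functions over the full attached log would show which of the 10 steps the build was on when the job was cancelled.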
Hi,
I am trying to create a Docker image for nerfstudio based on this one: https://hub.docker.com/layers/dromni/nerfstudio/1.1.4/images/sha256-ff0107a7db96bb8ee29c638729328b832b268b890c50f2a2ff25988bb84d4f75?context=explore
I have attached a longer log output for more context: tiny-cuda-nn-wheel-docker-log.txt
I cannot really make much sense of these logs, and I have run out of ideas on how to debug this, so any help is much appreciated. Thank you!