NVlabs / instant-ngp

Instant neural graphics primitives: lightning fast NeRF and more
https://nvlabs.github.io/instant-ngp

Build fail - recipe for target 'testbed' failed #347

Closed: kajc10 closed this issue 2 years ago

kajc10 commented 2 years ago

The build fails on a headless container. Any ideas what the problem could be?

Specs:
Ubuntu 18.04.6 LTS
CUDA 11.6
Tesla V100 (DGX Station, 4 pcs)
Python 3.9.7
cmake 3.22.3
gcc/g++ 7.5.0

root@610d9320bd51:~/workdir/instant-ngp# cmake . -B build -DNGP_BUILD_WITH_GUI=off

-- !!! Warning OptiX_INSTALL_DIR not set in environment. using default
-- OptiX_INSTALL_DIR value: /usr/local/NVIDIA-OptiX-SDK-7.3.0-linux64-x86_64
-- pybind11 v2.7.1
CMake Warning (dev) at /root/workdir/cmake-3.22.3/share/cmake-3.22/Modules/CMakeDependentOption.cmake:84 (message):
  Policy CMP0127 is not set: cmake_dependent_option() supports full Condition
  Syntax.  Run "cmake --help-policy CMP0127" for policy details.  Use the
  cmake_policy command to set the policy and suppress this warning.
Call Stack (most recent call first):
  dependencies/pybind11/CMakeLists.txt:98 (cmake_dependent_option)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Targeting GPU architectures: 70;70;70;70
CMake Warning at dependencies/tiny-cuda-nn/CMakeLists.txt:112 (message):
  Fully fused MLPs do not support GPU architectures of 70 or less.  Falling
  back to CUTLASS MLPs.  Remove GPU architectures 70 and lower to allow
  maximum performance

root@610d9320bd51:~/workdir/instant-ngp# cmake --build build --config RelWithDebInfo -j 16

...long output... End of output:

collect2: error: ld returned 1 exit status
CMakeFiles/testbed.dir/build.make:98: recipe for target 'testbed' failed
make[2]: *** [testbed] Error 1
CMakeFiles/Makefile2:192: recipe for target 'CMakeFiles/testbed.dir/all' failed
make[1]: *** [CMakeFiles/testbed.dir/all] Error 2
Makefile:90: recipe for target 'all' failed
make: *** [all] Error 2

The full output is too long, I attach it as a link: error.txt

leventt commented 2 years ago

Do you maybe have multiple CUDA SDK installations on your system?

Can you print what you see when you run: nvcc -V

I would check the build env variables and perhaps set something like this explicitly to rule it out:

export CUDA_HOME=/usr/local/cuda
export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export LIBRARY_PATH="$LIBRARY_PATH:/usr/local/cuda/lib64"

What may be happening here is that CMake is finding an old CUDA SDK on your system and cannot link some of the symbols from the newer CUDA version.
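One way to check for the situation described above is to list every CUDA toolkit installed side by side and compare it with the nvcc that is first on PATH. The following is only a sketch: list_nvcc is a hypothetical helper name, and it assumes the toolkits live in directories matching /usr/local/cuda*, which is the usual layout but may differ on your system.

```shell
# Hypothetical helper: print every executable nvcc found under
# <prefix>/cuda*/bin. The prefix defaults to /usr/local (an assumption
# about where the toolkits are installed).
list_nvcc() {
    prefix="${1:-/usr/local}"
    for candidate in "$prefix"/cuda*/bin/nvcc; do
        if [ -x "$candidate" ]; then
            echo "$candidate"
        fi
    done
    return 0
}

# Toolkits installed side by side.
list_nvcc

# The nvcc that the shell (and usually CMake) picks up first.
command -v nvcc || echo "nvcc is not on PATH"
```

If the second command points at an older toolkit than expected, CMake has probably cached that compiler. Deleting the build directory and configuring again, possibly with -DCMAKE_CUDA_COMPILER set to the full path of the intended nvcc, is one way to rule that out.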

kajc10 commented 2 years ago

Yes, I did have some problems with CUDA. I'm working in a remote container, and at every reset my env variables were lost, so nvcc -V showed CUDA 9.1 instead of 11.6. To solve it, I had only added the following lines to .bash_profile:

export PATH=/usr/local/cuda-11.6/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-11.6/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

After running your snippet, the build completes. Thanks.
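For readers wondering about the ${PATH:+:${PATH}} form used in the exports in this thread: it appends a colon and the old value only when the variable is already set and non-empty, so no stray leading or trailing colon ends up in the result. A minimal sketch of the same expansion, using a hypothetical helper named prepend_path:

```shell
# Hypothetical helper: print <dir> followed by ":<current>" only when
# <current> is non-empty, mirroring the ${VAR:+:${VAR}} idiom.
prepend_path() {
    dir="$1"
    current="$2"
    echo "${dir}${current:+:${current}}"
}

# Empty current value: no trailing colon is added.
prepend_path /usr/local/cuda-11.6/lib64 ""
# prints /usr/local/cuda-11.6/lib64

# Non-empty current value: the old entries follow after a colon.
prepend_path /usr/local/cuda-11.6/bin "/usr/bin:/bin"
# prints /usr/local/cuda-11.6/bin:/usr/bin:/bin
```

This matters mostly for LD_LIBRARY_PATH, which is often unset in a fresh container. An empty entry in that variable is treated as the current directory by the dynamic linker.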