Open White-waluigi opened 3 weeks ago
Found out that the most likely culprit for this not working is that, despite using an 11.8 CUDA container, the NVIDIA Container Toolkit forces CUDA 12.4 (the host version), even though my 535 driver officially supports 11.8. According to NVIDIA, this is a feature, not a bug.
Replacing the Dockerfile with this:
# Reference:
# https://github.com/cvpaperchallenge/Ascender
# https://github.com/nerfstudio-project/nerfstudio
FROM nvidia/cuda:12.4.0-devel-ubuntu22.04
ARG USER_NAME=dreamer
ARG GROUP_NAME=dreamers
ARG UID=1000
ARG GID=1000
# Set compute capability for nerfacc and tiny-cuda-nn
# See https://developer.nvidia.com/cuda-gpus and limit number to speed-up build
ENV TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0 7.5 8.0 8.6 8.9 9.0+PTX"
ENV TCNN_CUDA_ARCHITECTURES=90;89;86;80;75;70;61;60
# Speed-up build for RTX 30xx
# ENV TORCH_CUDA_ARCH_LIST="8.6"
# ENV TCNN_CUDA_ARCHITECTURES=86
# Speed-up build for RTX 40xx
# ENV TORCH_CUDA_ARCH_LIST="8.9"
# ENV TCNN_CUDA_ARCHITECTURES=89
ENV CUDA_HOME=/usr/local/cuda
ENV PATH=${CUDA_HOME}/bin:/home/${USER_NAME}/.local/bin:${PATH}
ENV LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}
ENV LIBRARY_PATH=${CUDA_HOME}/lib64/stubs:${LIBRARY_PATH}
# apt install by root user
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
    build-essential \
    curl \
    git \
    libegl1-mesa-dev \
    libgl1-mesa-dev \
    libgles2-mesa-dev \
    libglib2.0-0 \
    libsm6 \
    libxext6 \
    libxrender1 \
    python-is-python3 \
    python3.10-dev \
    python3-pip \
    wget \
    && rm -rf /var/lib/apt/lists/*
# Change user to non-root user
RUN groupadd -g ${GID} ${GROUP_NAME} \
    && useradd -ms /bin/sh -u ${UID} -g ${GID} ${USER_NAME}
USER ${USER_NAME}
RUN pip install --upgrade pip setuptools==69.5.1 ninja
RUN pip install torch torchvision
# Install nerfacc and tiny-cuda-nn before installing requirements.txt
# because these two installations are time consuming and error prone
RUN pip install git+https://github.com/KAIR-BAIR/nerfacc.git@v0.5.2
RUN pip install git+https://github.com/NVlabs/tiny-cuda-nn.git#subdirectory=bindings/torch
COPY requirements.txt /tmp
RUN cd /tmp && pip install -r requirements.txt
WORKDIR /home/${USER_NAME}/threestudio
Type nvidia-smi to find your CUDA version and replace it in the FROM line.
Got me past the error. It's downloading right now; will report back once it's done.
I had to update to NVIDIA driver 550 (to get CUDA 12.4; 12.2 does not work) to make it work. Ubuntu keeps trying to downgrade it, though.
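For anyone who wants to script the nvidia-smi tip, here is a rough sketch. The function names `parse_cuda_version` and `set_base_image` are made up for illustration, and it assumes that an image tag of the form `<version>.0-devel-ubuntu22.04` exists for your CUDA version, which is worth checking on Docker Hub first.

```shell
#!/usr/bin/env bash
# Sketch: read the CUDA version from the nvidia-smi banner and point the
# Dockerfile's FROM line at the matching nvidia/cuda image.

# Extract "X.Y" from text containing "CUDA Version: X.Y" (read from stdin).
parse_cuda_version() {
  sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Rewrite the FROM line of a Dockerfile in place (GNU sed).
# ASSUMPTION: the tag <version>.0-devel-ubuntu22.04 exists for that version.
set_base_image() {
  local version="$1" dockerfile="$2"
  sed -i "s|^FROM nvidia/cuda:.*|FROM nvidia/cuda:${version}.0-devel-ubuntu22.04|" "$dockerfile"
}

# Usage on the host (not run here):
#   version="$(nvidia-smi | parse_cuda_version)"
#   set_base_image "$version" Dockerfile
```

The parsing is kept separate from the rewrite so you can check what version it picked up before touching the Dockerfile.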
I am installing on Ubuntu Server 24.04; it installed 12.4 by default.
I've installed the Docker version of this, but I am getting the following error:
ImportError: /home/dreamer/.local/lib/python3.10/site-packages/tinycudann_bindings/_86_C.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops5zeros4callEN3c108ArrayRefINS2_6SymIntEEENS2_8optionalINS2_10ScalarTypeEEENS6_INS2_6LayoutEEENS6_INS2_6DeviceEEENS6_IbEE
I guessed this is an issue with the CUDA version (which is 12.2). I've tried downgrading the driver, but I can only install 535, which comes with 12.2 by default. No other driver works at all, including 550.
So is it even possible to run this project with a consumer GPU? Has anybody been able to make it work?
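One way to narrow down the ImportError: the missing symbol demangles to `at::_ops::zeros::call(c10::ArrayRef<c10::SymInt>, c10::optional<c10::ScalarType>, ...)`, which lives in PyTorch rather than in CUDA itself. That hints (but does not prove) that the tiny-cuda-nn binding was compiled against a different PyTorch build than the one being imported; the CUDA version may still play a part. A minimal sketch for checking where a symbol comes from, with `symbol_origin` being a made-up helper name:

```shell
#!/usr/bin/env bash
# Sketch: guess which library a mangled C++ symbol belongs to from its
# leading namespace. Itanium ABI names start with _ZN<len><name>, so
# "_ZN2at" is namespace "at" and "_ZN3c10" is namespace "c10", both of
# which PyTorch uses.
symbol_origin() {
  case "$1" in
    _ZN2at*|_ZN3c10*) echo "pytorch" ;;
    *)                echo "unknown" ;;
  esac
}

# Usage inside the container (not run here):
#   symbol_origin "<mangled symbol from the ImportError>"
#   echo "<mangled symbol from the ImportError>" | c++filt
#   python -c 'import torch; print(torch.__version__, torch.version.cuda)'
```

If the symbol is a PyTorch one, comparing the printed torch version with the one present when tiny-cuda-nn was built is probably a better next step than swapping drivers.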