Closed srama2512 closed 4 years ago
If your system has a non-standard EGL install, i.e. if you need to do something like export LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/nvidia-opengl:${LD_LIBRARY_PATH}, you will likely need to mount /usr/lib/x86_64-linux-gnu/nvidia-opengl (add -v /usr/lib/x86_64-linux-gnu/nvidia-opengl) and set LD_LIBRARY_PATH in the docker container as well.
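As a rough sketch of the suggestion above (not a command taken from this thread), both the mount and the environment variable can be passed on the docker run command line. The image name my-image and the helpers run and egl_docker_run are placeholders made up for illustration, and GPU runtime flags are left out because they depend on your setup. By default the script only prints the command it would run.

```shell
#!/bin/sh
# Host directory holding the non-standard EGL libraries (path from the comment above).
EGL_DIR="${EGL_DIR:-/usr/lib/x86_64-linux-gnu/nvidia-opengl}"
# Placeholder image name (assumption): substitute the image you actually run.
IMAGE="${IMAGE:-my-image}"
# DRY_RUN=1 (default) only prints the command; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Hypothetical helper: start the container with the EGL directory bind-mounted
# read-only at the same path and LD_LIBRARY_PATH pointing at it. Extra
# arguments are passed through as the command to run inside the container.
egl_docker_run() {
  run docker run --rm \
    -v "${EGL_DIR}:${EGL_DIR}:ro" \
    -e "LD_LIBRARY_PATH=${EGL_DIR}" \
    "${IMAGE}" "$@"
}

egl_docker_run
```

Two details are worth keeping in mind. The host:container form of -v is what creates a bind mount of the host directory. Also, -e replaces any LD_LIBRARY_PATH the image already defines rather than prepending to it, so if the image sets one you may need to append that value by hand.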
Thanks! It works now.
Hi @erikwijmans, I ran into the same issue that @srama2512 described above, but the path you suggested mounting (-v /usr/lib/x86_64-linux-gnu/nvidia-opengl) is missing on my machine. (It looks like nvidia-opengl is not installed on the machine.)
Could you please explain what this library is used for and how I can install it?
I tried googling nvidia-opengl but couldn't find anything.
UPD: I tried the suggestions listed here, but nothing worked on my machine. I also created a new GPU instance in the cloud and, following this comment, went to https://hub.docker.com/r/nvidia/cudagl and ran all the installation commands from the Dockerfiles listed in the section CUDA 10.1 update 2 + OpenGL (glvnd 1.2) (10.1/base/Dockerfile and glvnd/devel/Dockerfile), but I still get the error described above.
I would be very grateful if somebody could help me resolve this issue or share the list of commands you ran to set up the machine.
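When the suggested directory does not exist, one way to see whether EGL libraries are present elsewhere on the host is to search for them by name. This is a minimal sketch, assuming the libraries follow the usual libEGL* naming (for example libEGL.so.1). find_egl_libs is a hypothetical helper and the directory list is only a guess at common locations.

```shell
#!/bin/sh
# Hypothetical helper: list files named libEGL* under each directory given.
# Directories that do not exist are skipped silently.
find_egl_libs() {
  for dir in "$@"; do
    if [ -d "$dir" ]; then
      find "$dir" -name 'libEGL*' 2>/dev/null || true
    fi
  done
}

# Common library locations (assumption: yours may differ).
find_egl_libs /usr/lib /usr/lib64 /usr/local/lib
```

If something turns up, its directory is a candidate to mount and add to LD_LIBRARY_PATH in place of the nvidia-opengl path. If nothing turns up, the driver's EGL libraries are probably not installed on the host at all.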
Hi @rpartsey, I ran into the same problem as you did. May I ask whether you have solved it? Thanks very much.
Hi @vincent341. Yes, I faced the same problem. The root cause was an incomplete CUDA installation.
Some packages require the CUDA development tools (which, if I'm not mistaken, should be properly installed either on your computer (the host) or inside the docker container), but the base nvidia docker images don't include them.
See the Overview of Images section at https://hub.docker.com/r/nvidia/cuda/.
Inspired by this Stack Overflow response, I used the devel docker image as an example and added the following RUN command, which solved the issue for me:
FROM fairembodied/habitat-challenge:testing_2020_habitat_base_docker
ARG TORCH_CUDA_ARCH_LIST="6.0 6.1 7.0+PTX 7.5+PTX"
RUN apt-get update && apt-get install -y --no-install-recommends \
        cuda-nvml-dev-$CUDA_PKG_VERSION \
        cuda-command-line-tools-$CUDA_PKG_VERSION \
        cuda-nvprof-$CUDA_PKG_VERSION \
        cuda-npp-dev-$CUDA_PKG_VERSION \
        cuda-libraries-dev-$CUDA_PKG_VERSION \
        cuda-minimal-build-$CUDA_PKG_VERSION \
        libcublas-dev=10.2.1.243-1 \
        libnccl-dev=$NCCL_VERSION-1+cuda10.1 \
    && apt-mark hold libnccl-dev \
    && rm -rf /var/lib/apt/lists/*
# ...
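To check whether a container has the development tools mentioned above, one quick proxy is to look for nvcc, which the devel images ship and the base images do not. The following is a minimal sketch, assuming the default install prefix /usr/local/cuda. check_cuda_dev_tools is a hypothetical helper, not part of any CUDA tooling.

```shell
#!/bin/sh
# Hypothetical helper: report whether the CUDA compiler (nvcc) is reachable,
# either on PATH or under the given CUDA root (default /usr/local/cuda).
# nvcc is used here only as a proxy for "development tools are installed".
check_cuda_dev_tools() {
  cuda_root="${1:-/usr/local/cuda}"
  if command -v nvcc >/dev/null 2>&1 || [ -x "${cuda_root}/bin/nvcc" ]; then
    echo "nvcc found"
  else
    echo "nvcc missing"
  fi
}

check_cuda_dev_tools
```

Run it inside the container, for example in an interactive session started from the image built with the RUN step above.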
Hi @rpartsey ,
Thanks very much for your instructions. Let me try. I'm still struggling with it now.
I followed the instructions to build the local docker file.
It built successfully, but local testing via ./test_locally_pointnav_rgbd.sh resulted in the following error:
I created an interactive session inside the docker container via:
nvidia-smi worked:
Running a simple PyTorch script on the GPU also worked: