vitruvi closed this issue 2 years ago
Hi,
thanks for trying out our package! I'm surprised that the CUDA driver seems to be incompatible with the PyTorch version.
Let me propose two potential solutions. The first would be to install the latest PyTorch version.
For this, you can open a terminal within JupyterLab and run:
pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
or run it within the notebook "1_inspect_data.ipynb":
!pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
Then restart the notebook kernel; hopefully everything will work.
If that still leads to the same error message, you could consider downgrading the CUDA driver on your system to version 11.3 and then rebuilding the container.
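To check whether either change took effect, a quick look at what PyTorch sees from inside the notebook may help. The snippet below is only a sketch: `cuda_report` is a hypothetical helper name, not part of the package, and it assumes it is run in the same kernel as "1_inspect_data.ipynb". It uses only `torch.__version__`, `torch.version.cuda`, and the `torch.cuda` query functions.

```python
import importlib.util


def cuda_report():
    """Summarise what PyTorch can see as a plain dict (hypothetical helper)."""
    report = {
        "torch_installed": False,
        "torch_version": None,
        "built_for_cuda": None,
        "cuda_available": False,
        "devices": [],
    }
    # Bail out early if PyTorch is not importable in this kernel.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # CUDA version the wheel was built against; None for CPU-only builds.
    report["built_for_cuda"] = torch.version.cuda
    # False when no usable NVIDIA driver is visible to this process.
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["devices"] = [
            torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())
        ]
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` is still False after the kernel restart, posting the printed report together with the error message makes it easier to tell a PyTorch build mismatch from a container that cannot see the GPU at all.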
Let me know if it works or if you have any questions!
Konstantin
I have the same error, and sadly the pip install does not help. I also can't downgrade from CUDA 11.7 to 11.3, as the driver does not support it. I use the GPUs without problems in a conda environment with cudatoolkit 11.3, so it seems to be a Docker problem. I don't see a real solution though, as PyTorch also does not (yet) support CUDA 11.7...
Hi - yes that could be a Docker problem. @cblessing24, would you have an idea of how to solve it?
You are probably missing the NVIDIA Container Toolkit. This package is necessary to use GPUs in Docker containers. Please install it and report back.
The problem persists in my case. After installing the NVIDIA Container Toolkit for Docker, I get the same error as in my first message.
Did you verify that you can use GPUs in a container after installing the toolkit? What is the output of the following command?
sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
Yes, nvidia-smi works in Docker.
+-----------------------------------------------------------------------------+ | NVIDIA-SMI 470.129.06 Driver Version: 470.129.06 CUDA Version: 11.4 | ...
Can you post the whole output here please?
Okay, I think I was able to reproduce the issue. I changed my docker-compose.yml file to this to fix it:
version: '3.4'
services:
  jupyterlab:
    image: sensorium
    build:
      context: .
    volumes:
      - .:/project
      - ./notebooks:/notebooks
    environment:
      - JUPYTER_PASSWORD=
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: ["gpu"]
Sources: https://docs.docker.com/compose/gpu-support/ https://github.com/compose-spec/compose-spec/blob/master/deploy.md#devices
That gives the error below.
ERROR: The Compose file './docker-compose.yml' is invalid because: services.jupyterlab.deploy.resources.reservations value Additional properties are not allowed ('devices' was unexpected)
Which version of docker-compose are you using? I am using 2.6.1.
Yeah, that is the problem. "apt" installs version 1.25 on Ubuntu 20.04, so docker-compose should be installed manually. Now it works out of the box.
docker-compose version 2.6.1 could be listed as a requirement on the main page.
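For anyone hitting the same "'devices' was unexpected" error, the version check can be sketched in a few lines. This is an illustration, not part of the package: the function names are hypothetical, the sample strings are made-up examples of `docker-compose --version` output, and the 1.28.0 threshold is taken from the Docker GPU support page linked above, so treat it as an assumption and adjust it if the docs change.

```python
import re

# Assumption: per the Docker docs on GPU support, `devices` reservations
# need Compose v1.28.0 or newer. Adjust if that page says otherwise.
MIN_COMPOSE = (1, 28, 0)


def parse_compose_version(text):
    """Extract (major, minor, patch) from `docker-compose --version` output."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version number found in: {text!r}")
    major, minor, patch = match.groups()
    # A missing patch component (e.g. "1.25") is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def supports_gpu_devices(text, minimum=MIN_COMPOSE):
    """Return True if the reported Compose version is at least `minimum`."""
    return parse_compose_version(text) >= minimum


if __name__ == "__main__":
    # Illustrative sample strings, not captured from a real system.
    samples = [
        "docker-compose version 1.25.0, build unknown",
        "Docker Compose version v2.6.1",
    ]
    for sample in samples:
        print(sample, "->", supports_gpu_devices(sample))
```

With these two samples the first line reports False and the second True, which matches what was seen in this thread: the apt-installed 1.25 rejects the compose file, while 2.6.1 accepts it.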
Great to hear that this problem is fixed! For other discussions, feel free to join our Slack community (the link can be found in the contact section at https://sensorium2022.net/home).
After installing Docker and docker-compose and running "docker-compose run -d -p 10101:8888 jupyterlab", I get the error below when I run "1_inspect_data.ipynb".
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
My specs are Ubuntu 20.04, Docker version 20.10.16, an RTX 2070 graphics card, and CUDA installed on the host system outside of Docker (Driver Version: 470.129.06, CUDA Version: 11.4). How can I solve this problem?