Open claudio4 opened 1 year ago
I believe that the current problem comes from the fact that Rancher Desktop's docker runs on top of busybox and all the drivers that Nvidia has put out are based on Ubuntu. Is there a way to run Rancher Desktop's docker daemon on Ubuntu 20.04?
I believe that the current problem comes from the fact that Rancher Desktop's docker runs on top of busybox and all the drivers that Nvidia has put out are based on Ubuntu.
Yes. Alpine uses musl and Ubuntu uses glibc. You can install a glibc compatibility library on Alpine, but I don't know if this will give you CUDA.
The best I can find is https://arto.s3.amazonaws.com/notes/cuda. If you try this out and get CUDA running with Rancher Desktop, then please leave a note here with what you did!
Is there a way to run Rancher Desktop's docker daemon on Ubuntu 20.04?
No, this is not possible. Rancher Desktop makes specific assumptions about the VM images being used; they are custom-built for Rancher Desktop.
Is there any contribution documentation I can consult to estimate the possibility of adding support for Ubuntu as an alternative backend to Busybox?
I would also like to see Rancher Desktop on windows support GPUs. I am able to do this with Podman Desktop. For Podman Desktop I opened a shell prompt in the running container, e.g., podman-machine-default, and ran the following:
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo yum install -y nvidia-container-toolkit
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
nvidia-ctk cdi list
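A quick way to confirm that the generated CDI spec is usable is to look for the nvidia.com/gpu=all device and then run nvidia-smi in a throwaway container. The sketch below assumes that nvidia-ctk cdi list prints one device name per line and that Podman is the container engine. The function names has_cdi_gpu and verify_podman_gpu are made up for this example and are not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: reads a CDI device list on stdin and succeeds
# if the "all GPUs" device name is present.
has_cdi_gpu() {
    grep -qF 'nvidia.com/gpu=all'
}

# Hypothetical wrapper: only start a container if the CDI device exists.
# Nothing runs until you call verify_podman_gpu yourself.
verify_podman_gpu() {
    if nvidia-ctk cdi list 2>/dev/null | has_cdi_gpu; then
        # --security-opt=label=disable avoids SELinux labeling problems
        # on hosts where SELinux is enforcing.
        podman run --rm --device nvidia.com/gpu=all \
            --security-opt=label=disable ubuntu nvidia-smi -L
    else
        echo "no nvidia.com/gpu=all CDI device found" >&2
        return 1
    fi
}
```

Calling verify_podman_gpu inside the Podman machine should list the GPUs if the toolkit and CDI spec are set up correctly.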
I wanted to try that with Rancher Desktop, but I couldn't get it to work, I suppose in part because Rancher Desktop is based on an Alpine Linux image, which uses musl. I found a glibc package and installed it, but ran into a problem that I can't remember now. Maybe it was that I still didn't have an nvidia-container-toolkit installation for Alpine. I decided to keep using Podman for now.
It seems there are a few possible solutions: (1) get nvidia to write a container-toolkit for alpine linux, (2) find the right way to install glibc and nvidia-container-toolkit in alpine, (3) change Rancher Desktop to use e.g., debian-slim.
It seems there are a few possible solutions: (1) get nvidia to write a container-toolkit for alpine linux,
This seems quite unlikely.
(2) find the right way to install glibc and nvidia-container-toolkit in alpine,
This would be the best short/medium term plan. I don't know if this is possible at all, but worth trying.
(3) change Rancher Desktop to use e.g., debian-slim.
This is not going to happen in the medium term (i.e. in 2024). I don't want to rule it out completely, but it would be a significant effort to do it right, and there is a lot of internal refactoring needed before we would attempt this.
I'd like to set up a Kubernetes environment with Rancher Desktop's k3s, but this makes me use Docker Desktop instead. Nowadays GPU support is really needed. Really sad.
@jandubois, Have a look at this project: https://github.com/sgerrand/alpine-pkg-glibc Also see this related thread : https://github.com/sgerrand/alpine-pkg-glibc/issues/199
I think this could help us add support for GPU (cuda)
Let me know what you think.
No GPU support? Sadly, I ran into this while trying to move away from Docker Desktop.
For now until this is fixed, I have installed docker within wsl.
This doesn't provide me with the GUI and doesn't let me see my images/containers/etc. from windows, but it allows for use of docker and GPUs in containers in WSL without Rancher Desktop or Docker Desktop (or Podman Desktop).
From within WSL, uninstall old docker conflicting packages, setup apt repo, install the latest version of docker (see https://docs.docker.com/engine/install/ubuntu/ for exact commands).
Within WSL, make sure you can see your GPU (you can do this before installing docker).
nvidia-smi
should return something like this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.10    Driver Version: 535.86.10    CUDA Version: 12.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:1E.0 Off |                    0 |
| N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Then install nvidia-container-toolkit using the commands found here (https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html): configure the repo with curl, run apt-get update, then apt install nvidia-container-toolkit. After that, configure docker with an nvidia-ctk command and restart it with a systemctl restart docker command (see link for exact commands).
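For orientation, the steps above look roughly like the sketch below on Ubuntu. It assumes the repository URLs and keyring path from NVIDIA's install guide, which may have changed, so treat the linked guide as authoritative. The function names are made up for this example, and nothing runs until you call them.

```shell
#!/bin/sh
# Sketch only: commands follow NVIDIA's install guide and may be outdated.
# Function names are hypothetical wrappers, defined but not called here.

# Step 1: add NVIDIA's signing key and apt repository.
add_nvidia_apt_repo() {
    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
        | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
    curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
        | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
        | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
}

# Step 2: install the toolkit package.
install_nvidia_container_toolkit() {
    sudo apt-get update
    sudo apt-get install -y nvidia-container-toolkit
}

# Step 3: register the nvidia runtime with docker and restart the daemon.
configure_docker_runtime() {
    sudo nvidia-ctk runtime configure --runtime=docker
    sudo systemctl restart docker
}

# To run everything in order, stopping at the first failure:
# add_nvidia_apt_repo && install_nvidia_container_toolkit && configure_docker_runtime
```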
Then, in WSL, the command
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
should return an nvidia-smi response similar to the above.
(Actually, on some of my machines, it shows a seg fault after the nvidia-smi printout, but I can access the GPUs fine.)
Problem Description
At the moment it seems like Rancher Desktop for Windows does not support Nvidia CUDA. I have tried both the containerd and the dockerd engines. Executing
nerdctl run --rm --gpus all nvidia/cuda:12.0.1-devel-ubuntu22.04 nvidia-smi
fails. Meanwhile, dockerd complains about the lack of a driver.
Proposed Solution
Rancher Desktop needs to check whether CUDA is available in WSL; if it is, it should install the NVIDIA Container Toolkit in Rancher's WSL distro.
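The availability check could look something like the sketch below. It rests on two assumptions that are not from this issue: that WSL2 exposes the GPU as /dev/dxg, and that it mounts the Windows driver's libcuda.so.1 under /usr/lib/wsl/lib. The function name wsl_cuda_available is made up for this example, and this is not how Rancher Desktop currently works.

```shell
#!/bin/sh
# Sketch: guess whether this WSL distro has GPU/CUDA passthrough.
# Assumed defaults (not confirmed in this issue):
#   /dev/dxg                      GPU paravirtualization device in WSL2
#   /usr/lib/wsl/lib/libcuda.so.1 CUDA driver library mounted from Windows
# Both paths can be overridden via arguments, which makes the check
# easy to try out on a machine without a GPU.
wsl_cuda_available() {
    dxg_device="${1:-/dev/dxg}"
    wsl_lib_dir="${2:-/usr/lib/wsl/lib}"
    [ -e "$dxg_device" ] && [ -e "$wsl_lib_dir/libcuda.so.1" ]
}

if wsl_cuda_available; then
    echo "CUDA passthrough detected: the NVIDIA Container Toolkit could be installed"
else
    echo "No CUDA passthrough detected: skipping toolkit installation"
fi
```

Checking for files rather than running nvidia-smi keeps the check cheap and avoids depending on a glibc binary inside the Alpine-based distro.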
Additional Information
This might be tricky, as the NVIDIA Container Toolkit does not support Alpine. I attempted to manually install the toolkit in the rancher-desktop distro, to no avail. The main roadblock is the lack of glibc. gcompat looks promising: I got the toolkit running, but only for it to complain about the lack of CUDA.