rancher-sandbox / rancher-desktop

Container Management and Kubernetes on the Desktop
https://rancherdesktop.io
Apache License 2.0

CUDA support (WSL) #3968

Open claudio4 opened 1 year ago

claudio4 commented 1 year ago

Problem Description

At the moment, it seems that Rancher Desktop for Windows does not support NVIDIA CUDA. I have tried both the containerd and the dockerd engines.

Executing nerdctl run --rm --gpus all nvidia/cuda:12.0.1-devel-ubuntu22.04 nvidia-smi fails with:

> nerdctl run --rm --gpus all nvidia/cuda:12.0.1-devel-ubuntu22.04 nvidia-smi
FATA[0000] exec: "nvidia-container-cli": executable file not found in $PATH

Meanwhile, dockerd complains about the lack of a driver.

> docker run --rm --gpus all  nvidia/cuda:12.0.1-devel-ubuntu22.04 nvidia-smi
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

Proposed Solution

Rancher Desktop needs to check whether CUDA is available in WSL. If it is, it should install the NVIDIA Container Toolkit in Rancher Desktop's WSL distro.
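As a rough illustration of the check described above, here is a minimal sketch. It assumes (this is not confirmed anywhere in this thread) that WSL 2 exposes the GPU as /dev/dxg and mounts the Windows driver's user-mode libraries under /usr/lib/wsl/lib. The function name wsl_gpu_available and its optional root-prefix argument are hypothetical and exist only to make the check easy to try out.

```shell
#!/bin/sh
# Sketch only: report whether WSL GPU paravirtualization looks available.
# Assumptions (not from this thread): WSL 2 exposes the GPU as /dev/dxg and
# mounts the driver's user-mode libraries under /usr/lib/wsl/lib.
# The optional first argument is a hypothetical root prefix used for testing.
wsl_gpu_available() {
    root="${1:-}"
    [ -e "${root}/dev/dxg" ] && [ -e "${root}/usr/lib/wsl/lib/libcuda.so.1" ]
}

if wsl_gpu_available; then
    echo "WSL GPU support detected"
else
    echo "WSL GPU support not detected"
fi
```

A positive result would only mean the host side is ready. Installing the toolkit inside the distro is a separate problem.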

Additional Information

This might be tricky, as the NVIDIA Container Toolkit does not support Alpine. I attempted to manually install the toolkit in the rancher-desktop distro, to no avail. The main roadblock is the lack of glibc. gcompat looks promising: I got the toolkit running, but it then complained about the lack of CUDA.

rne1223 commented 1 year ago

I believe that the current problem comes from the fact that Rancher Desktop's docker runs on top of busybox and all the drivers that NVIDIA has put out are based on Ubuntu. Is there a way to run Rancher Desktop's docker daemon on Ubuntu 20.04?

jandubois commented 1 year ago

I believe that the current problem comes from the fact that Rancher Desktop's docker runs on top of busybox and all the drivers that NVIDIA has put out are based on Ubuntu.

Yes. Alpine uses musl and Ubuntu uses glibc. You can install a glibc compatibility library on Alpine, but I don't know if this will give you CUDA.
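To see which C library a given distro uses, a small sketch like the following may help. It assumes that the text printed by `ldd --version` mentions "musl" on musl-based systems and "GLIBC" or "GNU libc" on glibc-based ones. The helper name classify_libc is hypothetical.

```shell
#!/bin/sh
# Sketch only: guess the C library from the text printed by `ldd --version`.
# Assumption: musl's ldd mentions "musl"; glibc's mentions "GLIBC" or "GNU libc".
classify_libc() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        *musl*) echo "musl" ;;
        *glibc*|*"gnu libc"*) echo "glibc" ;;
        *) echo "unknown" ;;
    esac
}

# musl's ldd may print to stderr and exit non-zero, so capture both streams
# and ignore the exit status.
classify_libc "$(ldd --version 2>&1 || true)"
```

Run inside the rancher-desktop distro, this would be expected to report musl. Software built against glibc, such as the NVIDIA toolkit binaries, will not load there without a compatibility layer.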

The best I can find is https://arto.s3.amazonaws.com/notes/cuda. If you try this out and get CUDA running with Rancher Desktop, then please leave a note here with what you did!

Is there a way to run Rancher Desktop's docker daemon on Ubuntu 20.04?

No, this is not possible. Rancher Desktop makes specific assumptions about the VM images being used; they are custom-built for Rancher Desktop.

shikanime commented 1 year ago

Is there any contribution documentation I can consult to estimate the possibility of adding support for Ubuntu as an alternative backend to Busybox?

RadicalAcronym commented 6 months ago

I would also like to see Rancher Desktop on Windows support GPUs. I am able to do this with Podman Desktop. For Podman Desktop, I opened a shell prompt in the running container, e.g., podman-machine-default, and ran the following:

curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo yum install -y nvidia-container-toolkit
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
nvidia-ctk cdi list

I wanted to try that with Rancher Desktop, but I couldn't get it to work, I suppose in part because Rancher Desktop is based on an Alpine Linux image, which uses musl. I found a glibc package and installed it, but ran into a problem that I can't remember now. Maybe it was that I still didn't have an nvidia-container-toolkit installation for Alpine. I decided to keep using Podman for now.

It seems there are a few possible solutions: (1) get NVIDIA to write a container toolkit for Alpine Linux, (2) find the right way to install glibc and nvidia-container-toolkit in Alpine, (3) change Rancher Desktop to use, e.g., debian-slim.

jandubois commented 6 months ago

It seems there are a few possible solutions: (1) get NVIDIA to write a container toolkit for Alpine Linux,

This seems quite unlikely.

(2) find the right way to install glibc and nvidia-container-toolkit in Alpine,

This would be the best short- to medium-term plan. I don't know if this is possible at all, but it is worth trying.

(3) change Rancher Desktop to use, e.g., debian-slim.

This is not going to happen in the medium term (i.e. in 2024). I don't want to rule it out completely, but it would be a significant effort to do it right, and there is a lot of internal refactoring needed before we would attempt this.

choigawoon commented 2 months ago

I'd like to set up a Kubernetes environment with Rancher Desktop's k3s, but the lack of GPU support forces me to use Docker Desktop. Nowadays, GPU support is really needed. Really sad.