NVIDIA / nvidia-container-runtime

NVIDIA container runtime
Apache License 2.0

Add last stable Debian OS support #151

Closed comassky closed 2 years ago

comassky commented 3 years ago

Hi,

Debian 11 has just been released but nvidia-docker does not support it yet.

Is it possible to add it to the list of supported OSes?

elezar commented 2 years ago

Hi @comassky. There are no direct packages for Debian 11 because it uses cgroupv2 by default, and the NVIDIA container stack does not work with the default configuration on such platforms. There is a workaround that can be used (see https://github.com/NVIDIA/libnvidia-container/issues/111).

If you are able to get this workaround working, please close this issue. Improving the situation is on our roadmap.

See also #152.
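To check whether a given host is affected, one common approach is to look at the filesystem type mounted at /sys/fs/cgroup. The sketch below is not part of the NVIDIA tooling; the helper name `cgroup_mode_from_fstype` is hypothetical, and it assumes GNU `stat` and the usual convention that `cgroup2fs` means a unified cgroupv2 hierarchy while `tmpfs` means cgroupv1 or a hybrid setup.

```shell
# Hypothetical helper: map the filesystem type of /sys/fs/cgroup to a cgroup mode.
# Assumption: cgroup2fs = unified cgroupv2, tmpfs = cgroupv1 (or hybrid).
cgroup_mode_from_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1-or-hybrid" ;;
    *)         echo "unknown" ;;
  esac
}

# Query the running host; prints "unknown" if stat fails or the path is missing.
cgroup_mode_from_fstype "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)"
```

A result of `v2` would suggest the host is in the default configuration described above.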

comassky commented 2 years ago

Hi, workaround works ;)

klueska commented 2 years ago

We now have an RC of libnvidia-container out that adds support for cgroupv2.

If you would like to try it out, make sure to add the experimental repo to your apt sources and install the latest packages:

For DEBs

sudo sed -i -e '/experimental/ s/^#//g' /etc/apt/sources.list.d/libnvidia-container.list
sudo apt-get update
sudo apt-get install -y libnvidia-container-tools libnvidia-container1

For RPMs

sudo yum-config-manager --enable libnvidia-container-experimental
sudo yum install -y libnvidia-container-tools libnvidia-container1

klueska commented 2 years ago

libnvidia-container-1.8.0-rc.2 is now live with some minor updates to fix some edge cases around cgroupv2 support. Assuming you followed the steps above, re-running the update and install commands should give you the latest packages.

Note: This does not directly add debian11 support, but you can point to the debian10 repo and install from there for now.
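As a rough sketch of what "pointing to the debian10 repo" could look like: instead of detecting the distribution, hard-code `debian10` when building the repository list URL. The helper name `nvidia_repo_list_url` is hypothetical, and the URL pattern is assumed to be the one used by the setup commands elsewhere in this thread.

```shell
# Hypothetical helper: build the repo list URL for a given distribution string.
# Assumption: the URL layout matches the nvidia-docker setup commands in this thread.
nvidia_repo_list_url() {
  printf 'https://nvidia.github.io/nvidia-docker/%s/nvidia-docker.list\n' "$1"
}

# On Debian 11, override the detected value with the Debian 10 repo.
distribution=debian10
nvidia_repo_list_url "$distribution"

# To apply it (not executed here):
# curl -s -L "$(nvidia_repo_list_url "$distribution")" \
#   | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
```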

klueska commented 2 years ago

libnvidia-container-1.8.0 with cgroupv2 support is now GA

Release notes here: https://github.com/NVIDIA/libnvidia-container/releases/tag/v1.8.0

klueska commented 2 years ago

Debian 11 support has now been added, so running the following should work as expected:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
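The `distribution` value above is just `$ID` and `$VERSION_ID` from /etc/os-release concatenated, which on Debian 11 would be expected to give `debian11`. A minimal sketch for inspecting that value before adding the repo follows; the function name `distribution_from_os_release` is hypothetical, and it accepts an optional file path purely to make it easy to try against a sample file.

```shell
# Hypothetical helper: print $ID$VERSION_ID from an os-release style file.
# The file is sourced in a subshell so its variables do not leak into the caller.
distribution_from_os_release() {
  ( . "${1:-/etc/os-release}" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Show what the setup command would use on this host, if the file is present.
if [ -r /etc/os-release ]; then
  distribution_from_os_release
fi
```

If the printed value has no matching directory under the repository URL, the list download in the command above will not return a usable sources file.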