canonical / docker-snap

https://snapcraft.io/docker
MIT License

unknown or invalid runtime name: nvidia #180

Status: closed by sfo 1 month ago

sfo commented 2 months ago

According to the documentation, the following should work with no errors:

$ sudo snap install docker
docker 24.0.5 from Canonical✓ installed
$ sudo docker run -it --rm --runtime=nvidia tensorflow/tensorflow:latest-gpu python
docker: Error response from daemon: unknown or invalid runtime name: nvidia.
See 'docker run --help'.

jocado commented 1 month ago

Hi @sfo

Not much info to go on there.

Is this on Ubuntu Core or Ubuntu Server/Desktop? Which revision of the snap are you running? The stable revision changed on 3 Sep, so the date of your report alone does not tell me which one you have.

The current revision in stable (rev 2932) should work.

The output of snap logs -n 100 docker.nvidia-container-toolkit could be useful if you are still having the issue.
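
For anyone hitting the same error, a minimal sketch of how the requested information could be gathered. The helper names `has_nvidia_runtime` and `collect_diagnostics` are hypothetical and not part of the snap. The sketch assumes `docker info` reports the registered runtimes as JSON with one quoted key per runtime name.

```shell
# Succeeds if the runtimes JSON read from stdin contains a runtime named "nvidia".
has_nvidia_runtime() {
  grep -q '"nvidia"'
}

# Gathers the details asked for above. It is only defined here, not run,
# because it needs the docker snap and root privileges.
collect_diagnostics() {
  snap list docker                                  # installed revision
  snap logs -n 100 docker.nvidia-container-toolkit  # toolkit setup log
  if sudo docker info --format '{{json .Runtimes}}' | has_nvidia_runtime; then
    echo "nvidia runtime is registered with dockerd"
  else
    echo "nvidia runtime is NOT registered with dockerd"
  fi
}
```

Running `collect_diagnostics` on the affected machine and pasting the output into the issue would cover the revision and log questions in one step.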

sfo commented 1 month ago

Running an update of the full system (apt, snap, docker images) after vacation solved the issue.
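
A rough sketch of what such a full update might look like. The exact commands that were run are not stated in this thread, so the sequence below is an assumption, and the helper names `snap_revision` and `update_everything` are hypothetical.

```shell
# Prints the Rev column from `snap list <name>` output read on stdin
# (first line is the header, second line is the snap itself).
snap_revision() {
  awk 'NR == 2 { print $3 }'
}

# Updates apt packages, snaps, and the GPU image, then prints the docker
# snap revision. It is only defined here, not run, because it needs root.
update_everything() {
  sudo apt update && sudo apt full-upgrade -y
  sudo snap refresh
  sudo docker pull tensorflow/tensorflow:latest-gpu
  snap list docker | snap_revision
}
```

After `update_everything`, the printed revision can be compared with the one mentioned above (2932) before retrying the `--runtime=nvidia` command.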

FYI:

Ubuntu Desktop

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 24.04.1 LTS
Release:        24.04
Codename:       noble

Kernel

$ uname -a
Linux orodruin 6.8.0-45-generic #45-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug 30 12:02:04 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

GPU Driver

$ nvidia-smi
Thu Sep 26 10:04:35 2024       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
[...]

Snap

$ snap --version
snap    2.63.1+24.04
snapd   2.63.1+24.04
series  16
ubuntu  24.04
kernel  6.8.0-45-generic

Docker

$ snap info docker
name:      docker
summary:   Docker container runtime
publisher: Canonical**
store-url: https://snapcraft.io/docker
contact:   https://github.com/docker-snap/docker-snap/issues?q=
license:   (Apache-2.0 AND MIT AND GPL-2.0)
description: |
  Build and run container images with Docker.

  **Usage**

  * This build can only access files in the home directory. So Dockerfiles
  and all other files used in commands like `docker build`, `docker save` and
  `docker load` need to be in $HOME.
  * You can change the configuration of this build by modifying the files in
  `/var/snap/docker/current/`.
  * Additional certificates used by the Docker daemon to authenticate with
  registries need to be added in
  `/var/snap/docker/current/etc/docker/certs.d` (instead of
  `/etc/docker/certs.d`). This directory can be accessed by other snaps using
  the `docker-registry-certificates` content interface.

  **Running Docker as normal user**

  By default, Docker is only accessible with root privileges (`sudo`). If you
  want to use docker as a regular user, you need to add your user to the
  `docker` group.

      sudo addgroup --system docker
      sudo adduser $USER docker
      newgrp docker
      sudo snap disable docker
      sudo snap enable docker

  **Warning:** if you add your user to the `docker` group, it will have
  similar power as the `root` user. For details on how this impacts security
  in your system, see
  https://docs.docker.com/engine/security/#docker-daemon-attack-surface

  **Authors**

  This snap is built by Canonical based on source code published by Docker,
  Inc. It is not endorsed or published by Docker, Inc.

  Docker and the Docker logo are trademarks or registered trademarks of
  Docker, Inc. in the United States and/or other countries. Docker, Inc. and
  other parties may also have trademark rights in other terms used herein.
commands:
  - docker.compose
  - docker
  - docker.help
services:
  docker.dockerd:                  simple, enabled, active
  docker.nvidia-container-toolkit: oneshot, enabled, inactive
snap-id:      sLCsFAO8PKM5Z0fAKNszUOX0YASjQfeZ
tracking:     latest/stable
refresh-date: 22 days ago, at 11:18 CEST
channels:
  latest/stable:    24.0.5   2024-09-03 (2932) 138MB -
  latest/candidate: 24.0.5   2024-09-03 (2932) 138MB -
  latest/beta:      27.2.0   2024-09-19 (2963) 146MB -
  latest/edge:      27.2.0   2024-09-20 (2969) 146MB -
  core18/stable:    20.10.17 2023-03-13 (2746) 146MB -
  core18/candidate: ^                                
  core18/beta:      ^                                
  core18/edge:      ^                                
installed:          24.0.5              (2932) 138MB -

TensorFlow Image

$ docker images
REPOSITORY              TAG          IMAGE ID       CREATED        SIZE
tensorflow/tensorflow   latest-gpu   30369aa97d26   2 months ago   7.4GB