NVIDIA / nvidia-container-runtime

NVIDIA container runtime

Running nvidia-container-runtime with podman is blowing up. #85

Closed rhatdan closed 1 year ago

rhatdan commented 5 years ago
  1. Issue or feature description: rootless and rootful podman do not work with the nvidia plugin.

  2. Steps to reproduce the issue: install the nvidia plugin, configure it to run with podman, execute the podman command, and check whether the devices are configured correctly.

  3. Information to attach (optional if deemed irrelevant):

    - Some nvidia-container information: nvidia-container-cli -k -d /dev/tty info
    - Kernel version from uname -a: Fedora 30 and later
    - Any relevant kernel output lines from dmesg
    - Driver information from nvidia-smi -a
    - Docker version from docker version
    - NVIDIA packages version from dpkg -l 'nvidia' or rpm -qa 'nvidia'
    - NVIDIA container library version from nvidia-container-cli -V
    - NVIDIA container library logs (see troubleshooting)
    - Docker command, image and tag used

I am reporting this based on complaints from other users. This is what they said:

We discovered that the Ubuntu 18.04 machine needed a configuration change to get rootless working with nvidia: "no-cgroups = true" was set in /etc/nvidia-container-runtime/config.toml. Unfortunately this config change did not work on CentOS 7, but it did change the rootless error to: nvidia-container-cli: initialization error: cuda error: unknown error

This config change breaks podman running as root, with the error: Failed to initialize NVML: Unknown Error

Interestingly, root on Ubuntu gets the same error even though rootless works.
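For reference, the config change described above amounts to flipping a single line in /etc/nvidia-container-runtime/config.toml. A minimal sketch, assuming the stock file ships the setting commented out or set to false:

# rootless podman: enable; rootful podman: leave it false (or commented out)
sudo sed -i 's/^#\?no-cgroups *=.*/no-cgroups = true/' /etc/nvidia-container-runtime/config.toml
grep no-cgroups /etc/nvidia-container-runtime/config.toml   # expect: no-cgroups = true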

secondspass commented 3 years ago

I don't know if this is the right place to ask, and I can open a separate issue if needed.

I'm testing rootless Podman v3.0 with crun v0.17 on our Summit test systems at Oak Ridge (IBM Power 9 with Nvidia Tesla V100 GPUs, RHEL 8.2). We have a restriction that we can't set up and maintain the subuid/subgid mappings for each of our users in the /etc/sub[uid|gid] files; that would be a giant administrative overhead, since the mapping would have to be maintained on every node. Currently, pulling or building CUDA containers works just fine, but running one fails:

% podman run --rm --security-opt=label=disable --hooks-dir=/usr/share/containers/oci/hooks.d/ oci-archive:/ccs/home/subil/subil-containers-oci/simplecuda nvidia-smi
Getting image source signatures
Copying blob 5ef3c0b978d0 done
Copying blob d23be3dac067 done
Copying blob 786d8ed1601c done
Copying blob 6e99435589e0 done
Copying blob 93d25f6f9464 done
Copying blob d1ababb2c734 done
Copying config beba83a3b2 done
Writing manifest to image destination
Storing signatures
Error: OCI runtime error: error executing hook `/usr/bin/nvidia-container-toolkit` (exit code: 1)

Here, simplecuda is just an oci-archive of docker.io/nvidia/cuda-ppc64le:10.2-base-centos7 (our HPC system uses IBM PowerPC).

The nvidia-container-toolkit.log looks like this

-- WARNING, the following logs are for debugging purposes only --

I0330 21:24:39.001988 1186667 nvc.c:282] initializing library context (version=1.3.0, build=16315ebdf4b9728e899f615e208b50c41d7a5d15)
I0330 21:24:39.002033 1186667 nvc.c:256] using root /
I0330 21:24:39.002038 1186667 nvc.c:257] using ldcache /etc/ld.so.cache
I0330 21:24:39.002043 1186667 nvc.c:258] using unprivileged user 65534:65534
I0330 21:24:39.002058 1186667 nvc.c:299] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0330 21:24:39.002241 1186667 nvc.c:301] dxcore initialization failed, continuing assuming a non-WSL environment
W0330 21:24:39.002259 1186667 nvc.c:167] skipping kernel modules load due to user namespace
I0330 21:24:39.002400 1186672 driver.c:101] starting driver service
E0330 21:24:39.002442 1186672 driver.c:161] could not start driver service: privilege change failed: operation not permitted
I0330 21:24:39.003214 1186667 driver.c:196] driver service terminated successfully

I've tried a whole variety of different Podman flag combinations mentioned earlier in this issue thread. None have worked. They all have the same errors above in the output and the log file.

I have the hook json file properly set up

% cat /usr/share/containers/oci/hooks.d/oci-nvidia-hook.json
{
    "version": "1.0.0",
    "hook": {
        "path": "/usr/bin/nvidia-container-toolkit",
        "args": ["nvidia-container-toolkit", "prestart"],
        "env": [
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
        ]
    },
    "when": {
        "always": true,
        "commands": [".*"]
    },
    "stages": ["prestart"]
}

The nvidia-container-runtime config.toml looks like this

[76a@raptor07 ~]$ cat /etc/nvidia-container-runtime/config.toml
disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"
#accept-nvidia-visible-devices-envvar-when-unprivileged = true
#accept-nvidia-visible-devices-as-volume-mounts = false

[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
debug = "/tmp/.nvidia-container-toolkit.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
no-cgroups = true
#user = "root:video"
ldconfig = "@/sbin/ldconfig"

[nvidia-container-runtime]
debug = "/tmp/.nvidia-container-runtime.log"

My storage.conf looks like this

% cat ~/.config/containers/storage.conf
[storage]
driver = "overlay"
graphroot = "/tmp/subil-containers-peak"
rootless_storage_path = "$HOME/.local/share/containers/storage"
#rootless_storage_path = "/tmp/subil-containers-storage-peak"

[storage.options]
additionalimagestores = [
]

[storage.options.overlay]
ignore_chown_errors = "true"
mount_program = "/usr/bin/fuse-overlayfs"
mountopt = "nodev,metacopy=on"

[storage.options.thinpool]

For comparison, I also tested this on a PowerPC workstation (identical to the HPC nodes: IBM Power9 with Nvidia Tesla V100, RHEL 8.2) and I get the exact same errors there too. But once we set up the subuid/subgid mappings on the workstation and did echo "user.max_user_namespaces=28633" > /etc/sysctl.d/userns.conf, Podman was able to run the cuda container without issue.

[76a@raptor07 gpu]$ podman run  --rm docker.io/nvidia/cuda-ppc64le:10.2-base-centos7 nvidia-smi -L
GPU 0: Tesla V100-PCIE-16GB (UUID: GPU-4d2aad84-ad3d-430b-998c-6124d28d8e7c)

So I know the issue is that we need both the subuid/subgid mappings and the user.max_user_namespaces. I want to know if it is possible to get the nvidia-container-toolkit working with rootless Podman without needing the subuid/subgid mappings.
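(For comparison, the conventional per-user setup being avoided here would look roughly like the sketch below on every node; the ID range is only illustrative.)

# illustrative only: allocate a subordinate UID/GID range for one user on one node
sudo usermod --add-subuids 100000-165535 --add-subgids 100000-165535 $USER
podman system migrate   # have podman pick up the new mappings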

For reference, we had a related issue (https://github.com/containers/podman/issues/8580) with MPI not working because of the lack of subuid/subgid mappings. @giuseppe was able to patch crun and Podman to make that work for Podman v3 and crun >=v0.17. I wanted to know if there was something that could be done here to make the nvidia-container-toolkit also work under the same conditions.

I'm happy to provide more details if you need.

zjuwyz commented 3 years ago

I have posted this here but it seems this issue is more relevant and is still open, so I'm copying it here. I encountered exactly the same problem with podman 3.0.1 and nvidia-container-runtime 3.4.0-1:

/usr/bin/nvidia-container-runtime: find runc path: exec: "runc": executable file not found in $PATH

After some attempts, I found that --cap-add AUDIT_WRITE solves this problem.
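For example, the workaround looks roughly like this (a sketch; the image tag is just one used elsewhere in this thread):

podman run --rm --cap-add AUDIT_WRITE docker.io/nvidia/cuda:10.2-base-centos8 nvidia-smi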

(screenshot: 2021-04-18 12-06-57)

I have totally no idea why this would even work, though. Here's my podman info, I'm happy to offer any further detailed info if asked.

host:
  arch: amd64
  buildahVersion: 1.19.4
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.0.27-1
    path: /usr/bin/conmon
    version: 'conmon version 2.0.27, commit: 65fad4bfcb250df0435ea668017e643e7f462155'
  cpus: 16
  distribution:
    distribution: manjaro
    version: unknown
  eventLogger: journald
  hostname: manjaro
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.9.16-1-MANJARO
  linkmode: dynamic
  memFree: 26319368192
  memTotal: 33602633728
  ociRuntime:
    name: /usr/bin/nvidia-container-runtime
    package: /usr/bin/nvidia-container-runtime is owned by nvidia-container-runtime-bin 3.4.0-1
    path: /usr/bin/nvidia-container-runtime
    version: |-
      runc version 1.0.0-rc93
      commit: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
      spec: 1.0.2-dev
      go: go1.16.2
      libseccomp: 2.5.1
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    selinuxEnabled: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.1.9-1
    version: |-
      slirp4netns version 1.1.9
      commit: 4e37ea557562e0d7a64dc636eff156f64927335e
      libslirp: 4.4.0
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.1
  swapFree: 0
  swapTotal: 0
  uptime: 1h 50m 44.99s (Approximately 0.04 days)
registries:
  docker.io:
    Blocked: false
    Insecure: false
    Location: hub-mirror.c.163.com
    MirrorByDigestOnly: false
    Mirrors: null
    Prefix: docker.io
  search:
  - docker.io
store:
  configFile: /home/wangyize/.config/containers/storage.conf
  containerStore:
    number: 30
    paused: 0
    running: 1
    stopped: 29
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: /usr/bin/fuse-overlayfs is owned by fuse-overlayfs 1.5.0-1
      Version: |-
        fusermount3 version: 3.10.2
        fuse-overlayfs: version 1.5
        FUSE library version 3.10.2
        using FUSE kernel interface version 7.31
  graphRoot: /home/wangyize/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 2
  runRoot: /run/user/1000/containers
  volumePath: /home/wangyize/.local/share/containers/storage/volumes
version:
  APIVersion: 3.0.0
  Built: 1613921386
  BuiltTime: Sun Feb 21 23:29:46 2021
  GitCommit: c640670e85c4aaaff92741691d6a854a90229d8d
  GoVersion: go1.16
  OsArch: linux/amd64
  Version: 3.0.1
rhatdan commented 3 years ago

Does anyone have any idea what would require the AUDIT_WRITE capability?

qhaas commented 3 years ago

AUDIT_WRITE is a capability I'd rather not add... it looks like runc includes it by default?

In the OCI/runc spec they are even more drastic, retaining only audit_write, kill, and net_bind_service.
elezar commented 3 years ago

@zjuwyz @rhatdan

Looking at the error message, the nvidia-container-runtime (a simple shim for runc) is failing to find runc. This is implemented here: https://github.com/NVIDIA/nvidia-container-runtime/blob/v3.4.2/src/main.go#L96 and is due to exec.LookPath failing. Internally, that checks whether ${P}/runc exists, is not a directory, and is executable for each ${P} in the ${PATH}. This calls os.Stat, and I would assume that this query would trigger an entry in the audit log.

Do you have any audit logs to confirm that this is what is causing this?

Note: at this point, no container has been created or started as the runc create command has just been intercepted and the OCI spec patched to insert the NVIDIA hook.
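One way to check for such audit entries (a sketch; assumes auditd is running):

# look for recent seccomp denials and anything logged for the runtime shim
sudo ausearch -ts recent -m SECCOMP
sudo ausearch -ts recent -c nvidia-container-runtime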

giuseppe commented 3 years ago

the error looks like an outdated runc that doesn't understand errnoRet: https://github.com/opencontainers/runc/pull/2424

Without support for errnoRet, runc is not able to handle: https://github.com/containers/common/blob/master/pkg/seccomp/seccomp.json#L730-L833 and the only way to disable this chunk is to add CAP_AUDIT_WRITE.

I'd try with an updated runc first and see if it can handle the seccomp configuration generated by Podman when CAP_AUDIT_WRITE is not added.
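A quick way to check is simply:

runc --version          # check whether runc is new enough to understand errnoRet (see the PR linked above)
sudo dnf upgrade runc   # or your distro's equivalent package manager command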

secondspass commented 3 years ago

Following on my previous comment: https://github.com/NVIDIA/nvidia-container-runtime/issues/85#issuecomment-811080338

I tested out running different versions (v1.2.0, v1.3.0 and the latest v1.3.3) of nvidia-container-toolkit and libnvidia-container for rootless Podman without subuid/subgid on x86 machines as well, with identical settings and configs as I had in the PowerPC machines. The tests on x86 show the exact same issues for rootless Podman as they did on PowerPC.

elezar commented 3 years ago

@secondspass thanks for confirming that you're able to reproduce on an x86 machine. Do you have a simple way for us to reproduce this internally? This would allow us to better assess what the requirements are for getting this working.

We are in the process of reworking how the NVIDIA Container Stack works and this should address these kinds of issues, as we would make more use of the low-level runtime (crun in this case).

qhaas commented 3 years ago

Do you have a simple way for us to reproduce this internally?

While @secondspass's reported bug was with CentOS 8.3, I can report that it exists on x86-64 CentOS 8 Stream as well. Here is how to reproduce it on CentOS 8 Stream (which has an updated podman/crun stack):

  1. Verify nvidia cuda repos and nvidia-container-toolkit repos are enabled
  2. Deploy nvidia proprietary drivers: # dnf module install nvidia-driver:465-dkms, reboot and verify nvidia-smi works
  3. Deploy podman/crun stack: # dnf install crun podman skopeo buildah slirp4netns
  4. Enable use of containers without the need for subuid/subgid (per @secondspass ):
    
    cat ~/.config/containers/storage.conf
    [storage]
    driver = "overlay"
    graphroot = "/tmp/${USER}-containers-peak"
    rootless_storage_path = "${HOME}/.local/share/containers/storage"

    [storage.options]
    additionalimagestores = [
    ]

    [storage.options.overlay]
    ignore_chown_errors = "true"
    mount_program = "/usr/bin/fuse-overlayfs"
    mountopt = "nodev,metacopy=on"

    [storage.options.thinpool]

5. Verify the current user's subuid/subgid is not set, since these get added automatically if one uses certain CLI tools to add users:

$ grep $USER /etc/subuid | wc -l
0
$ grep $USER /etc/subgid | wc -l
0

6. Verify rootless containers work (without GPU acceleration):

$ podman run --rm docker.io/centos:8 cat /etc/redhat-release
CentOS Linux release 8.3.2011

7. Deploy libnvidia-container-tools:  `# dnf install nvidia-container-toolkit`
8. Modify the configuration to support podman / rootless (per @secondspass and others above):

cat /etc/nvidia-container-runtime/config.toml
disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"
#accept-nvidia-visible-devices-envvar-when-unprivileged = true
#accept-nvidia-visible-devices-as-volume-mounts = false

[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
debug = "/tmp/nvidia-container-toolkit.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
no-cgroups = true
#user = "root:video"
ldconfig = "@/sbin/ldconfig"

[nvidia-container-runtime]
debug = "/tmp/nvidia-container-runtime.log"

9. Test rootless podman with GPU acceleration and no subuid/subgid; it fails:

$ podman run --rm --security-opt=label=disable --hooks-dir=/usr/share/containers/oci/hooks.d/ docker.io/nvidia/cuda:10.2-base-centos8 nvidia-smi -L
...
Error: OCI runtime error: error executing hook /usr/bin/nvidia-container-toolkit (exit code: 1)
$ cat /tmp/nvidia-container-toolkit.log

-- WARNING, the following logs are for debugging purposes only --

I0421 13:52:26.487793 6728 nvc.c:372] initializing library context (version=1.3.3, build=bd9fc3f2b642345301cb2e23de07ec5386232317)
I0421 13:52:26.487987 6728 nvc.c:346] using root /
I0421 13:52:26.488002 6728 nvc.c:347] using ldcache /etc/ld.so.cache
I0421 13:52:26.488013 6728 nvc.c:348] using unprivileged user 65534:65534
I0421 13:52:26.488067 6728 nvc.c:389] attempting to load dxcore to see if we are running under Windows Subsystem for Linux (WSL)
I0421 13:52:26.488264 6728 nvc.c:391] dxcore initialization failed, continuing assuming a non-WSL environment
W0421 13:52:26.488328 6728 nvc.c:249] skipping kernel modules load due to user namespace
I0421 13:52:26.488877 6733 driver.c:101] starting driver service
E0421 13:52:26.489031 6733 driver.c:161] could not start driver service: privilege change failed: operation not permitted
I0421 13:52:26.498449 6728 driver.c:196] driver service terminated successfully

10. (Sanity check) Verify it works WITH sudo:

$ sudo podman run --rm --security-opt=label=disable --hooks-dir=/usr/share/containers/oci/hooks.d/ docker.io/nvidia/cuda:10.2-base-centos8 nvidia-smi -L
...
GPU 0: NVIDIA Tesla V100-PCIE-32GB (UUID: GPU-0a55d110-f8ea-4209-baa7-0e5675c7e832)


Version info for my run:

$ cat /etc/redhat-release
CentOS Stream release 8
$ nvidia-smi | grep Version
NVIDIA-SMI 465.19.01 Driver Version: 465.19.01 CUDA Version: 11.3
$ nvidia-container-cli --version
version: 1.3.3
$ crun --version
crun version 0.18
$ podman --version
podman version 3.1.0-dev



Update:  Spun this issue off [into its own issue](https://github.com/NVIDIA/nvidia-container-runtime/issues/145)
daveman1010221 commented 3 years ago

I have posted this here but it seems this issue is more relevant and is still open, so I'm copying it here. I encountered exactly the same problem with podman 3.0.1 and nvidia-container-runtime 3.4.0-1:

/usr/bin/nvidia-container-runtime: find runc path: exec: "runc": executable file not found in $PATH

After some attempts, I found that --cap-add AUDIT_WRITE solves this problem.

The fact that this works and solved the problem for me as well tells me this is a race condition.

KCSesh commented 3 years ago

I am a bit confused about the current state of rootless podman with GPUs.

I am on an Ubuntu 18.04 arm64 host.

I have made the following changes to /etc/nvidia-container-runtime/config.toml:

disable-require = false

[nvidia-container-cli]
environment = []
debug = "/tmp/nvidia-container-toolkit.log"
load-kmods = true
no-cgroups = true
ldconfig = "@/sbin/ldconfig.real"

[nvidia-container-runtime]
debug = "/tmp/nvidia-container-runtime.log"
  1. Is the change above only required on machines that are using cgroups v2?

I am only able to get GPU access if I run podman with sudo and --privileged (I need both; update: see comment below). So far I have found no other way to run podman with GPU access; even with the above cgroups change, my root does not break.

  2. What does this mean if my root is not breaking with the cgroup change?

When I run rootless, I see the following error:

Error: OCI runtime error: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 0 caused \\\"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods --debug=/dev/stderr configure --ldconfig=@/sbin/ldconfig.real --device=all --utility --pid=20052 /data/gpu/rootfs]\\\\n\\\\n-- WARNING, the following logs are for debugging purposes only --\\\\n\\\\nI0427 00:54:00.184026 20064 nvc.c:281] initializing library context (version=0.9.0+beta1, build=77c1cbc2f6595c59beda3699ebb9d49a0a8af426)\\\\nI0427 00:54:00.184272 20064 nvc.c:255] using root /\\\\nI0427 00:54:00.184301 20064 nvc.c:256] using ldcache /etc/ld.so.cache\\\\nI0427 00:54:00.184324 20064 nvc.c:257] using unprivileged user 65534:65534\\\\nI0427 00:54:00.184850 20069 driver.c:134] starting driver service\\\\nI0427 00:54:00.192642 20064 driver.c:231] driver service terminated with signal 15\\\\nnvidia-container-cli: initialization error: cuda error: no cuda-capable device is detected\\\\n\\\"\""

I have tried with --security-opt=label=disable and have seen no changes in behavior.

  3. It is unclear to me what runtime people are using. Are they using standard runc or /usr/bin/nvidia-container-runtime? I have tried both; neither works rootless, and both work as root with --privileged.
giuseppe commented 3 years ago

does it make any difference if you bind mount /dev from the host?

qhaas commented 3 years ago

security-opt=label=disable

I'm not very fluent in Ubuntu IT, but I believe that command targets SELinux. Ubuntu uses Apparmor for mandatory access control (MAC). So wouldn't the equivalent command be --security-opt 'apparmor=unconfined'?

KCSesh commented 3 years ago

First, a slight update and correction to the above: I don't actually need --privileged; I just need to define -e NVIDIA_VISIBLE_DEVICES=all, which invokes the nvidia hook, and that works with sudo. Rootless is still not fixed.

does it make any difference if you bind mount /dev from the host?

@giuseppe It does not fix rootless, but in rootful podman using sudo this makes the hook no longer required, which makes sense.

The issue with rootless is that I can't mount all of /dev/

Error: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"open /dev/console: permission denied\"": OCI permission denied

So I did the next best thing and attempted to mount all of the nv* devices under /dev/. I tried two ways: one with -v and the other using the --device flag and adding in the nvidia components. That still does not allow the rootless container to detect the GPUs! It does work using rootful podman with sudo.

The difference is that when it is mapped with sudo, I actually see the devices belong to root:video, whereas in rootless mode I only see nobody:nogroup.

I am wondering if it is related to the video group? The error I get in rootless mode is the following when trying to run CUDA code:

Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading == false: CUDA: no CUDA-capable device is detected

When I look in the container under /dev in the rootless container:

$ ls -la /dev
...
crw-rw----  1 nobody nogroup 505,  1 Apr 26 18:32 nvhost-as-gpu
...

For ALL of the nv* devices in the rootless container, they don't have a user/group mapped.

In the rootful container that uses sudo:

...
crw-rw----  1 root video   505,   1 Apr 26 18:32 nvhost-as-gpu
...

For ALL of the nv* devices in the rootful container that uses sudo, they have root:video

So I am pretty certain I need video mapped into the container, but I am unclear on how to do this. I have mapped in the video group with --group-add as a test, but I believe I also need to use --gidmap, because even with --group-add it still shows as nogroup.

My understanding of the user/group mapping podman does is a little fuzzy so I will take suggestions on how to do this 😄

Let me know what you think @giuseppe

security-opt=label=disable

I'm not very fluent in Ubuntu IT, but I believe that command targets SELinux. Ubuntu uses Apparmor for mandatory access control (MAC). So wouldn't the equivalent command be --security-opt 'apparmor=unconfined'?

@qhaas Excellent point. That explains why that flag seems to be a no-op for me. Also my current system, at least right now does not have apparmor loaded so I shouldn't need either of those flags. I tired it though just for sanity, and confirmed no difference in behavior. Thank you!

If you have any suggestions on gid mappings please let me know!

giuseppe commented 3 years ago

First a slight update and correction to the above, I don't actually need --privileged I just need to define -e NVIDIA_VISIBLE_DEVICES=all and this invokes the nvidia hook, which works with sudo. Rootless is not fixed still.

does it make any difference if you bind mount /dev from the host?

@giuseppe It does not fix rootless, but in rootful podman using sudo this makes the hook no longer required, which makes sense.

The issue with rootless is that I can't mount all of /dev/

could you use -v /dev:/dev --mount type=devpts,destination=/dev/pts ?

KCSesh commented 3 years ago

@giuseppe I tried adding: -v /dev:/dev --mount type=devpts,destination=/dev/pts

And got the following error:

DEBU[0004] ExitCode msg: "container create failed (no logs from conmon): eof"
Error: container create failed (no logs from conmon): EOF

Not sure how to enable more logs in conmon

If I switch to use: --runtime /usr/local/bin/crun with -v /dev:/dev --mount type=devpts,destination=/dev/pts

I get the following error:

Error: OCI runtime error: error executing hook `/usr/bin/nvidia-container-toolkit` (exit code: 1)

From previous encounters with this error, the way I understand this message is that the video group that is part of the mount location is not being mapped into the container correctly.

Just an FYI, I was able to get rootless podman to access the GPU if I added my user to the video group and used the runtime crun. More details here: https://github.com/containers/podman/issues/10166

I am still interested in a path forward without adding my user to the video group, but this is a good progress step.

KCSesh commented 3 years ago

Just as an update to what has been posted in https://github.com/containers/podman/issues/10166:

I have been able to access my GPU as a rootless user that belongs to the video group, using the nvidia hook:

cat /data/01-nvhook.json
{
  "version": "1.0.0",
  "hook": {
    "path": "/usr/bin/nvidia-container-toolkit",
    "args": ["nvidia-container-toolkit", "prestart"],
    "env": ["NVIDIA_REQUIRE_CUDA=cuda>=10.1"]
  },
  "when": {
    "always": true
  },
  "stages": ["prestart"]
}

But also this one seems to work as well:

cat /data/01-nvhook-runtime-hook.json
{
  "version": "1.0.0",
  "hook": {
    "path": "/usr/bin/nvidia-container-runtime-hook",
    "args": ["/usr/bin/nvidia-container-runtime-hook", "prestart"],
    "env": []
  },
  "when": {
    "always": true
  },
  "stages": ["prestart"]
}

Separately, without hooks I was able to use the --device mounts and access my GPU as well.

The important steps that had to be taken here were (a combined example command is sketched below):

  1. The rootless user needs to belong to the video group.
  2. Use the Podman flag --group-add keep-groups (this correctly maps the video group into the container).
  3. Use crun and not runc, because crun is the only runtime that supports --group-add keep-groups.
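Putting those steps together, a run might look roughly like this (a sketch; the image and hooks directory are example values taken from elsewhere in this thread):

sudo usermod -a -G video $USER   # step 1 (log out and back in so the group membership takes effect)
podman run --rm \
  --runtime /usr/bin/crun \
  --group-add keep-groups \
  --hooks-dir /usr/share/containers/oci/hooks.d/ \
  docker.io/nvidia/cuda:10.2-base-centos7 nvidia-smi -L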

I have a related issue here : https://github.com/containers/podman/issues/10212 to get this working in C++ with execv and am seeing an odd issue.

Ru13en commented 3 years ago

Hi, I've been using containers with access to GPUs. However, I've noticed that after each reboot I always need to run nvidia-smi once before starting the first container,

otherwise I get the error: Error: error executing hook `/usr/bin/nvidia-container-toolkit` (exit code: 1): OCI runtime error

After that I also need to run the NVIDIA Device Node Verification script to properly set up /dev/nvidia-uvm for CUDA applications, as described in this post: https://github.com/tensorflow/tensorflow/issues/32623#issuecomment-533936509
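A minimal sketch of that per-boot initialization, assuming nvidia-modprobe is installed (the referenced verification script does essentially the same thing with mknod):

sudo nvidia-modprobe -u -c=0   # load the driver and create the /dev/nvidia* and /dev/nvidia-uvm nodes
nvidia-smi -L                  # confirm the devices are visible before starting containers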

Just to share my HW configuration, which works (only with the --privileged flag) as root and rootless:

NAME="CentOS Linux"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Linux 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-8"
CENTOS_MANTISBT_PROJECT_VERSION="8"
CentOS Linux release 8.3.2011
CentOS Linux release 8.3.2011

getenforce: Enforcing

podman info:

  arch: amd64
  buildahVersion: 1.20.1
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.27-1.el8.1.5.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.27, commit: '
  cpus: 80
  distribution:
    distribution: '"centos"'
    version: "8"
  eventLogger: journald
  hostname: turing
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 2002
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 2002
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 4.18.0-240.22.1.el8_3.x86_64
  linkmode: dynamic
  memFree: 781801324544
  memTotal: 809933586432
  ociRuntime:
    name: crun
    package: crun-0.19.1-2.el8.3.1.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.19.1
      commit: 1535fedf0b83fb898d449f9680000f729ba719f5
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    path: /run/user/2002/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    selinuxEnabled: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.8-4.el8.7.6.x86_64
    version: |-
      slirp4netns version 1.1.8
      commit: d361001f495417b880f20329121e3aa431a8f90f
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.4.3
  swapFree: 42949668864
  swapTotal: 42949668864
  uptime: 29h 16m 48.14s (Approximately 1.21 days)
registries:
  search:
  - docker.io
  - quay.io
store:
  configFile: /home/user/.config/containers/storage.conf
  containerStore:
    number: 29
    paused: 0
    running: 0
    stopped: 29
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.5.0-1.el8.5.3.x86_64
      Version: |-
        fusermount3 version: 3.2.1
        fuse-overlayfs: version 1.5
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  graphRoot: /home/user/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 28
  runRoot: /run/user/2002/containers
  volumePath: /home/user/.local/share/containers/storage/volumes
version:
  APIVersion: 3.1.2
  Built: 1619185402
  BuiltTime: Fri Apr 23 14:43:22 2021
  GitCommit: ""
  GoVersion: go1.14.12
  OsArch: linux/amd64
  Version: 3.1.2
nvidia-smi | grep Version
NVIDIA-SMI 465.19.01 Driver Version: 465.19.01 CUDA Version: 11.3

cat /etc/nvidia-container-runtime/config.toml

disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"
#accept-nvidia-visible-devices-envvar-when-unprivileged = true
#accept-nvidia-visible-devices-as-volume-mounts = false

[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
#debug = "/var/log/nvidia-container-toolkit.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
no-cgroups = true
#user = "root:video"
ldconfig = "@/sbin/ldconfig"
elezar commented 3 years ago

Hi @Ru13en, the issue you described above seems to be different from what is being discussed here. Would you mind moving this to a separate GitHub issue? (I would assume this is because the nvidia container toolkit cannot load the kernel modules if it does not have the required permissions. Running nvidia-smi loads the kernel modules and also ensures that the device nodes are created.)

Ru13en commented 3 years ago

@elezar Thanks, i've opened another issue: https://github.com/NVIDIA/nvidia-container-runtime/issues/142

fuomag9 commented 3 years ago

For anybody who has the same issue as me ("nvidia-smi": executable file not found in $PATH: OCI not found or no NVIDIA GPU device is present: /dev/nvidia0 does not exist), this is how I made it work on Kubuntu 21.04 rootless:

Add your user to group video if not present: usermod -a -G video $USER

/usr/share/containers/oci/hooks.d/oci-nvidia-hook.json:

{
  "version": "1.0.0",
  "hook": {
    "path": "/usr/bin/nvidia-container-runtime-hook",
    "args": ["/usr/bin/nvidia-container-runtime-hook", "prestart"],
    "env": []
  },
  "when": {
    "always": true
  },
  "stages": ["prestart"]
}

/etc/nvidia-container-runtime/config.toml:

disable-require = false

[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "usr/bin/nvidia-container-cli"
environment = []
#debug = "/var/log/nvidia-container-runtime-hook.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
no-cgroups = true
#user = "root:video"
ldconfig = "@/sbin/ldconfig.real"

podman run -it --group-add video docker.io/tensorflow/tensorflow:latest-gpu-jupyter nvidia-smi

Sun Jul 18 11:45:06 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 465.31       Driver Version: 465.31       CUDA Version: 11.3     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:09:00.0  On |                  N/A |
| 31%   43C    P8     6W / 215W |   2582MiB /  7979MiB |      9%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
helson73 commented 3 years ago

@rhatdan @nvjmayo Turns out that getting rootless podman working with nvidia on centos 7 is a bit more complicated, at least for us.

Here is our scenario on brand new centos 7.7 machine

  1. Run nvidia-smi with rootless podman. Result: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused "process_linux.go:413: running prestart hook 0 caused \"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: cuda error: unknown error\n\""
  2. Run podman with user=root. Result: nvidia-smi works
  3. Run podman rootless. Result: nvidia-smi works!
  4. Reboot the machine, run podman rootless. Result: fails again with the same error as step 1

Conclusion: running an nvidia container with podman as root changes the environment so that rootless works; the environment is cleared on reboot.

One other comment: podman as root and rootless podman cannot run with the same /etc/nvidia-container-runtime/config.toml - no-cgroups must be false for root and true for rootless.

Hi, have you figured out a solution? I have exactly the same symptoms.

Rootless running only works after launching a container as root at least once, and a reboot resets everything. I am using RHEL 8.4 and can't believe this still happens after one year ...

qhaas commented 3 years ago

For those dropping into this issue, nvidia has documented getting GPU acceleration working with podman.

fuomag9 commented 3 years ago

For those dropping into this issue, nvidia has documented getting GPU acceleration working with podman.

That's awesome! The documentation is almost the same as my fix here in this thread :D

rhatdan commented 3 years ago

Any chance they can update the version of podman in the example? That one is pretty old.

KCSesh commented 3 years ago

@fuomag9 Are you using crun as opposed to runc out of curiosity? Does it work with both in rootless for you? Or just crun?

fuomag9 commented 3 years ago

@fuomag9 Are you using crun as opposed to runc out of curiosity? Does it work with both in rootless for you? Or just crun?

Working for me with both runc and crun set via /etc/containers/containers.conf with runtime = "XXX"
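For reference, that setting lives under the [engine] section (a sketch; the same key can also go in ~/.config/containers/containers.conf for a per-user override):

$ cat /etc/containers/containers.conf
[engine]
runtime = "crun"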

qhaas commented 2 years ago

--hooks-dir /usr/share/containers/oci/hooks.d/ does not seem to be needed anymore, at least with podman 3.3.1 and nvidia-container-toolkit 1.7.0.

For RHEL8 systems where selinux is enforcing, is it 'best practice' to add the nvidia selinux policy module and run podman with --security-opt label=type:nvidia_container_t (per RH documentation, even on non-DGX systems), or just run podman with --security-opt=label=disable (per nvidia documentation)? Unclear if there is any significant benefit to warrant messing with SELinux policy.

decandia50 commented 2 years ago

For folks finding this issue, especially anyone trying to do this on RHEL8 after following https://www.redhat.com/en/blog/how-use-gpus-containers-bare-metal-rhel-8, here's the current status/known issues that I've encountered. Hopefully this saves someone some time.

As noted in the comments above you can run containers as root without issue, but if you try to use --userns keep-id you're going to have a bad day.

Things that need to be done ahead of time to run rootless containers are documented in https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#step-3-rootless-containers-setup but the cheat sheet version is:

  1. Install nvidia-container-toolkit
  2. Update /etc/nvidia-container-runtime/config.toml and set no-cgroups = true
  3. Use NVIDIA_VISIBLE_DEVICES as part of your podman environment.
  4. Specify --hooks-dir=/usr/share/containers/oci/hooks.d/ (may not strictly be needed).

If you do that, then running: podman run -e NVIDIA_VISIBLE_DEVICES=all --hooks-dir=/usr/share/containers/oci/hooks.d/ --rm -ti myimage nvidia-smi should result in the usual nvidia-smi output. But, you'll note that the user in the container is root and that may not be what you want. If you use --userns keep-id; e.g. podman run --userns keep-id -e NVIDIA_VISIBLE_DEVICES=all --hooks-dir=/usr/share/containers/oci/hooks.d/ --rm -ti myimage nvidia-smi you will get an error that states: Error: OCI runtime error: crun: error executing hook /usr/bin/nvidia-container-toolkit (exit code: 1). From my reading above the checks that are run require the user to be root in the container.

Now for the workaround. You don't need this hook, you just need the nvidia-container-cli tool. All the hook really does is mount the correct libraries, devices, and binaries from the underlying system into the container. We can use nvidia-container-cli -k list and find to accomplish the same thing. Here's my one-liner below. Note that I'm excluding both -e NVIDIA_VISIBLE_DEVICES=all and --hooks-dir=/usr/share/containers/oci/hooks.d/.

Here's what it looks like: podman run --userns keep-id $(for file in $(nvidia-container-cli -k list); do find -L $(dirname $file) -xdev -samefile $file; done | awk '{print " -v "$1":"$1}' | xargs) --rm -ti myimage nvidia-smi

This is what the above is doing. We run nvidia-container-cli -k list which on my system produces output like:

$ nvidia-container-cli -k list
/dev/nvidiactl
/dev/nvidia-uvm
/dev/nvidia-uvm-tools
/dev/nvidia-modeset
/dev/nvidia0
/dev/nvidia1
/usr/bin/nvidia-smi
/usr/bin/nvidia-debugdump
/usr/bin/nvidia-persistenced
/usr/bin/nvidia-cuda-mps-control
/usr/bin/nvidia-cuda-mps-server
/usr/lib64/libnvidia-ml.so.470.141.03
/usr/lib64/libnvidia-cfg.so.470.141.03
/usr/lib64/libcuda.so.470.141.03
/usr/lib64/libnvidia-opencl.so.470.141.03
/usr/lib64/libnvidia-ptxjitcompiler.so.470.141.03
/usr/lib64/libnvidia-allocator.so.470.141.03
/usr/lib64/libnvidia-compiler.so.470.141.03
/usr/lib64/libnvidia-ngx.so.470.141.03
/usr/lib64/libnvidia-encode.so.470.141.03
/usr/lib64/libnvidia-opticalflow.so.470.141.03
/usr/lib64/libnvcuvid.so.470.141.03
/usr/lib64/libnvidia-eglcore.so.470.141.03
/usr/lib64/libnvidia-glcore.so.470.141.03
/usr/lib64/libnvidia-tls.so.470.141.03
/usr/lib64/libnvidia-glsi.so.470.141.03
/usr/lib64/libnvidia-fbc.so.470.141.03
/usr/lib64/libnvidia-ifr.so.470.141.03
/usr/lib64/libnvidia-rtcore.so.470.141.03
/usr/lib64/libnvoptix.so.470.141.03
/usr/lib64/libGLX_nvidia.so.470.141.03
/usr/lib64/libEGL_nvidia.so.470.141.03
/usr/lib64/libGLESv2_nvidia.so.470.141.03
/usr/lib64/libGLESv1_CM_nvidia.so.470.141.03
/usr/lib64/libnvidia-glvkspirv.so.470.141.03
/usr/lib64/libnvidia-cbl.so.470.141.03
/lib/firmware/nvidia/470.141.03/gsp.bin

We then loop through each of those files and run find -L $(dirname $file) -xdev -samefile $file. That finds all the symlinks to a given file, e.g.:

find -L /usr/lib64 -xdev -samefile /usr/lib64/libnvidia-ml.so.470.141.03
/usr/lib64/libnvidia-ml.so.1
/usr/lib64/libnvidia-ml.so.470.141.03
/usr/lib64/libnvidia-ml.so

We loop through each of those files and use awk and xargs to create the podman cli arguments to bind mount these files into the container; e.g. -v /usr/lib64/libnvidia-ml.so.1:/usr/lib64/libnvidia-ml.so.1 -v /usr/lib64/libnvidia-ml.so.470.141.03:/usr/lib64/libnvidia-ml.so.470.141.03 -v /usr/lib64/libnvidia-ml.so:/usr/lib64/libnvidia-ml.so etc.
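Written out as a script, the same one-liner looks roughly like this (a functionally equivalent sketch of the command above):

#!/bin/bash
# Build "-v host:container" bind-mount arguments for every file (and symlink to it)
# that nvidia-container-cli reports, then pass them all to podman.
mounts=()
for file in $(nvidia-container-cli -k list); do
  for path in $(find -L "$(dirname "$file")" -xdev -samefile "$file"); do
    mounts+=(-v "${path}:${path}")
  done
done
podman run --userns keep-id "${mounts[@]}" --rm -ti myimage nvidia-smi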

This effectively does what the hook does, using the tools the hook provides, but does not require the user running the container to be root, and does not require the user inside of the container to be root.

Hopefully this saves someone else a few hours.

baude commented 2 years ago

@decandia50 excellent information! your information really deserves to be highlighted. would you consider posting as a blog if we connect you with some people?

klueska commented 2 years ago

Please do not write a blog post with the above information. While the procedure may work on some setups, it is not a supported use of the nvidia-container-cli tool and will only work correctly under a very narrow set of assumptions.

The better solution is to use podman's integrated CDI support to have podman do the work that libnvidia-container would otherwise have done. The future of the nvidia stack (and device support in container runtimes in general) is CDI, and starting to use this method now will future-proof how you access generic devices.

Please see below for details on CDI: https://github.com/container-orchestrated-devices/container-device-interface

We have spent the last year rearchitecting the NVIDIA container stack to work together with CDI, and as part of this have a tool coming out with the next release that will be able to generate CDI specs for nvidia devices for use with podman (and any other CDI compatible runtimes).

In the meantime, you can generate a CDI spec manually, or wait for @elezar to comment on a better method to get a CDI spec generated today.

klueska commented 2 years ago

Here is an example of a (fully functional) CDI spec on my DGX-A100 machine (excluding MIG devices):

cdiVersion: 0.4.0
kind: nvidia.com/gpu
containerEdits:
  hooks:
  - hookName: createContainer
    path: /usr/bin/nvidia-ctk
    args:
    - /usr/bin/nvidia-ctk
    - hook
    - update-ldcache
    - --folder
    - /usr/lib/x86_64-linux-gnu
  deviceNodes:
  - path: /dev/nvidia-modeset
  - path: /dev/nvidiactl
  - path: /dev/nvidia-uvm
  - path: /dev/nvidia-uvm-tools
  mounts:
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-compiler.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libcuda.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libcuda.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvcuvid.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvcuvid.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-rtcore.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvoptix.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvoptix.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-eglcore.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-glcore.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-tls.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-glsi.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-fbc.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-ifr.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libEGL_nvidia.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libGLESv2_nvidia.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libGLESv1_CM_nvidia.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-glvkspirv.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-cbl.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-cbl.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libGL.so.1.0.0
    hostPath: /usr/lib/x86_64-linux-gnu/libGL.so.1.0.0
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libEGL.so.1.0.0
    hostPath: /usr/lib/x86_64-linux-gnu/libEGL.so.1.0.0
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libGLESv1_CM.so.1.0.0
    hostPath: /usr/lib/x86_64-linux-gnu/libGLESv1_CM.so.1.0.0
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libGLESv2.so.2.0.0
    hostPath: /usr/lib/x86_64-linux-gnu/libGLESv2.so.2.0.0
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.460.91.03
    hostPath: /usr/lib/x86_64-linux-gnu/libnvidia-ngx.so.460.91.03
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-smi
    hostPath: /usr/bin/nvidia-smi
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-debugdump
    hostPath: /usr/bin/nvidia-debugdump
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-persistenced
    hostPath: /usr/bin/nvidia-persistenced
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-cuda-mps-control
    hostPath: /usr/bin/nvidia-cuda-mps-control
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /usr/bin/nvidia-cuda-mps-server
    hostPath: /usr/bin/nvidia-cuda-mps-server
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /var/run/nvidia-persistenced/socket
    hostPath: /var/run/nvidia-persistenced/socket
    options:
    - ro
    - nosuid
    - nodev
    - bind
  - containerPath: /var/run/nvidia-fabricmanager/socket
    hostPath: /var/run/nvidia-fabricmanager/socket
    options:
    - ro
    - nosuid
    - nodev
    - bind
devices:
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia0
  name: gpu0
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia1
  name: gpu1
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia2
  name: gpu2
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia3
  name: gpu3
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia4
  name: gpu4
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia5
  name: gpu5
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia6
  name: gpu6
- containerEdits:
    deviceNodes:
    - path: /dev/nvidia7
  name: gpu7
decandia50 commented 2 years ago

@elezar Can you comment on the availability of a tool to generate the CDI spec as proposed by @klueska? I'm happy to use CDI if that's the way forward. Also happy to beta test a tool if you point me towards something.

Ru13en commented 2 years ago

Maintaining a nvidia.json CDI spec file for multiple machines with different Nvidia drivers and other libs is a bit painful. For instance, the NVIDIA driver installer should create a libnvidia-compiler.so symlink to libnvidia-compiler.so.460.91.03, etc... The CDI nvidia.json would just take the symlinks, to avoid manually setting all the mappings for a particular driver version... I am already using CDI specs on our machines, but I would like to test a tool to generate the CDI spec for any system...

elezar commented 2 years ago

@Ru13en we have a WIP Merge Request that adds an

nvidia-ctk info generate-cdi

command to the NVIDIA Container Toolkit. The idea being that this could be run at boot or triggered on a driver installation / upgrade. We are working on getting a v1.12.0-rc.1 out that includes this functionality for early testing and feedback.
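Assuming the command keeps that name and behavior after release (it comes from a work-in-progress merge request, so treat this as a sketch), usage would presumably be along the lines of:

# hypothetical invocation of the WIP command; podman looks for CDI specs under /etc/cdi
sudo nvidia-ctk info generate-cdi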

starry91 commented 2 years ago

@elezar Any ETA by when we can expect the v1.12.0-rc.1 version?

elezar commented 2 years ago

It will be released next week.

starry91 commented 2 years ago

I tried using the WIP version of nvidia-ctk (from the master branch of https://gitlab.com/nvidia/container-toolkit/container-toolkit) and was able to get it working with rootless podman, but not without issues. I have documented them in https://gitlab.com/nvidia/container-toolkit/container-toolkit/-/issues/8. @rhatdan The upcoming version of the nvidia CDI generator will use CDI version 0.5.0, while the latest podman, 4.2.0, still uses 0.4.0. Any idea when 4.3.0 might be available? (I see that 4.3.0-rc1 uses 0.5.0.)

elezar commented 2 years ago

Thanks for the confirmation @starry91. The official release of v1.12.0-rc.1 has been delayed a little, but thanks for testing the tooling nonetheless. I will have a look at the issue you created and update the tooling before releasing the rc.

elezar commented 1 year ago

We have recently updated our Podman support and now recommend using CDI -- which is supported natively in more recent Podman versions.

See https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#configuring-podman for details.
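With a recent NVIDIA Container Toolkit and a CDI-aware Podman, the flow sketched in that guide looks roughly like this (verify the exact commands against the linked documentation; the image tag is just one used earlier in this thread):

# generate a CDI spec describing the installed driver and GPUs
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
# request GPUs through CDI; no hooks dir or NVIDIA_VISIBLE_DEVICES needed
podman run --rm --device nvidia.com/gpu=all docker.io/nvidia/cuda:10.2-base-centos8 nvidia-smi -L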