Closed qhaas closed 11 months ago
The above used runc. I retried with crun: it works fine without GPU acceleration, but GPU acceleration still fails when subuid is not set. Logs attached as nct_fails_crun_log.txt
$ grep 'runtime =' /usr/share/containers/containers.conf
runtime = "crun"
#runtime = "runc"
$ podman run --rm --security-opt=label=disable --hooks-dir=/usr/share/containers/oci/hooks.d/ docker.io/nvidia/cuda:10.2-base-centos8 nvidia-smi -L
Error: OCI runtime error: error executing hook `/usr/bin/nvidia-container-toolkit` (exit code: 1)
$ podman run --rm docker.io/centos:8 cat /etc/redhat-release
CentOS Linux release 8.3.2011
As an alternative, I created this issue over on the podman GitHub to see if Singularity's approach to GPU acceleration is applicable to podman.
We have recently reworked our podman support and now suggest using CDI to request devices. Please see the updated documentation and feel free to open a new issue against https://github.com/NVIDIA/nvidia-container-toolkit if problems persist.
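For anyone landing here later, the CDI route mentioned above might look roughly like the sketch below. It assumes the NVIDIA Container Toolkit (which provides `nvidia-ctk`) and a CDI-aware podman release are installed. The function name `run_gpu_via_cdi` and the output path `/etc/cdi/nvidia.yaml` are illustrative choices, not something stated in this thread, and the image is simply the one used in the original report.

```shell
# Sketch only: assumes nvidia-ctk and a podman version with CDI support.
# run_gpu_via_cdi is a hypothetical helper name used for illustration.
run_gpu_via_cdi() {
    if ! command -v nvidia-ctk >/dev/null 2>&1; then
        echo "nvidia-ctk not found; install the NVIDIA Container Toolkit first" >&2
        return 1
    fi
    # Generate a CDI specification describing the GPUs on this host.
    # Writing under /etc/cdi needs root; the path is an assumed location.
    sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
    # Request all GPUs by CDI device name instead of relying on the OCI hook.
    podman run --rm --security-opt=label=disable \
        --device nvidia.com/gpu=all \
        docker.io/nvidia/cuda:10.2-base-centos8 nvidia-smi -L
}
```

Calling `run_gpu_via_cdi` as the rootless user would then replace the `--hooks-dir` invocation shown earlier. Whether this works without subuid / subgid is not confirmed anywhere in this thread.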
Given how Issue #85 is diverging in different directions and is becoming a catchall for all things podman, I thought I'd break the issue described in this comment out into its own issue. In certain situations (e.g. podman issue 8580), it is not practical to set up subuid / subgid for each user, so we'd like to get GPU acceleration working without them, which Singularity is able to do.
Test System (using the container-tools:3.0 appstream):
nvidia-container-runtime config (note that no-cgroups is now true and debug files are going to /tmp, per Issue #85):
podman storage config (per Issue #85 and rootless podman guide):
With subuid / subgid set, things work fine, logs posted as nct_works_log.txt
Without subuid / subgid set, GPU acceleration fails, but running without GPU acceleration works. Logs posted as nct_fails_log.txt
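To tell which of the two cases an account falls into, a quick check along these lines may help. It is a minimal sketch that assumes entries in /etc/subuid and /etc/subgid are keyed by login name (they can also be keyed by numeric UID, which this does not handle). The helper names `has_subid_entry` and `report_subids` are made up for illustration.

```shell
# has_subid_entry USER FILE: succeed if FILE contains a "USER:start:count" line.
# Exact match on the first colon-separated field; a missing file counts as "no entry".
has_subid_entry() {
    awk -F: -v u="$1" '$1 == u { found = 1 } END { exit !found }' "$2" 2>/dev/null
}

# report_subids USER: print whether USER has subordinate ID ranges configured.
report_subids() {
    for f in /etc/subuid /etc/subgid; do
        if has_subid_entry "$1" "$f"; then
            echo "$1 has an entry in $f"
        else
            echo "$1 has no entry in $f"
        fi
    done
}
```

Running `report_subids "$(id -un)"` as the affected user shows which files, if any, carry a range for that account.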
Per suggestions online, I added the account without subuid / subgid to the video group; that did not help. I'm also not clear on the implications of adding a user to the video group, so I asked over on the nvidia forums.