zvier opened this issue 2 years ago
Note that the following command doesn't use the same code path for injecting GPUs as what K8s does:
ctr run --rm --gpus 0 -t docker.io/nvidia/cuda:11.0.3-base-ubuntu20.04 cuda-11.0.3-base-ubuntu20.04 nvidia-smi
Would it be possible to test this with nerdctl instead, or to ensure that the RUNTIME is set instead of using the --gpus 0 flag?
Also, could you provide the version of the device plugin you are using, the driver version, and the version of the NVIDIA Container Toolkit?
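For reference, one way to check that the NVIDIA runtime is actually configured on the containerd side (a rough sketch; /etc/containerd/config.toml is the usual default path, and the handler name may differ on your system):
# Look for a runc-type runtime whose BinaryName points at nvidia-container-runtime.
sudo grep -n -A4 'nvidia' /etc/containerd/config.toml
# Or inspect the merged configuration containerd is really using:
containerd config dump | grep -n -A4 'nvidia'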
Two test cases for the above suggestions.
1. Use nerdctl instead of ctr:
nerdctl run --network=host --rm --gpus 0 -t docker.io/nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
No devices were found
2. Use `--runtime io.containerd.runc.v1` instead of `--gpus 0`:
ctr run --runtime io.containerd.runc.v1 --rm -t docker.io/nvidia/cuda:11.0.3-base-ubuntu20.04 cuda-11.0.3-base-ubuntu20.04 nvidia-smi
ctr: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "nvidia-smi": executable file not found in $PATH: unknown
For reference, nvidia-smi is installed on the host:
which nvidia-smi
/bin/nvidia-smi
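To pair with these results, a quick host-side sanity sketch (commands only, not output from this host):
# The driver and device nodes on the host:
nvidia-smi
ls -l /dev/nvidia*
# And the NVIDIA runtime binaries that containerd would need to invoke:
which nvidia-container-runtime nvidia-container-cli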
Device plugin:
nvidia-k8s-device-plugin:1.0.0-beta6
NVIDIA packages version:
rpm -qa '*nvidia*'
libnvidia-container-tools-1.3.1-1.x86_64
nvidia-container-runtime-3.4.0-1.x86_64
libnvidia-container1-1.3.1-1.x86_64
nvidia-docker2-2.5.0-1.noarch
nvidia-container-toolkit-1.4.0-2.x86_64
NVIDIA container library version:
nvidia-container-cli -V
version: 1.3.1
build date: 2020-12-14T14:18+0000
build revision: ac02636a318fe7dcc71eaeb3cc55d0c8541c1072
build compiler: gcc 4.8.5 20150623 (Red Hat 4.8.5-44)
build platform: x86_64
build flags: -D_GNU_SOURCE -D_FORTIFY_SOURCE=2 -DNDEBUG -std=gnu11 -O2 -g -fdata-sections -ffunction-sections -fstack-protector -fno-strict-aliasing -fvisibility=hidden -Wall -Wextra -Wcast-align -Wpointer-arith -Wmissing-prototypes -Wnonnull -Wwrite-strings -Wlogical-op -Wformat=2 -Wmissing-format-attribute -Winit-self -Wshadow -Wstrict-prototypes -Wunreachable-code -Wconversion -Wsign-conversion -Wno-unknown-warning-option -Wno-format-extra-args -Wno-gnu-alignof-expression -Wl,-zrelro -Wl,-znow -Wl,-zdefs -Wl,--gc-sections
We straced the nvidia-smi process in the container and found that access to the /dev/nvidiactl device was not permitted.
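Roughly how we narrowed it down; a sketch, not the literal commands we ran (strace must be available in the image, and <container-cgroup> is a placeholder for the pod's cgroup path):
# Inside the failing container: trace the open calls nvidia-smi makes; the
# permission error showed up against /dev/nvidiactl.
strace -f -e trace=open,openat nvidia-smi 2>&1 | grep /dev/nvidia
# From the host: inspect the device cgroup rules applied to the container.
cat /sys/fs/cgroup/devices/<container-cgroup>/devices.list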
@zvier those are very old versions for all of the packages and the device plugin. Would you be able to try with the latest versions?
The latest versions also do not work. But it can work if I add a securityContext field in my pod YAML like this:
apiVersion: v1
kind: Pod
metadata:
  name: gpu-operator-test
spec:
  restartPolicy: OnFailure
  containers:
  - name: cuda-vector-add
    image: "docker.io/nvidia/cuda:11.0.3-base-ubuntu20.04"
    command:
    - sleep
    - "36000"
    resources:
      limits:
        nvidia.com/gpu: 1
    securityContext:
      privileged: true
  nodeName: test-node-1
So to summarise: if you update the versions to the latest AND run the test pod as privileged, then you're able to run nvidia-smi in the container.
This is expected, since privileged mode mounts all of /dev/nv* into the container regardless and therefore avoids the permission errors on /dev/nvidiactl.
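For comparison, a quick way to see what the privileged pod exposes (a sketch using the gpu-operator-test pod name from the YAML above):
# In the privileged pod every /dev/nvidia* node is present and the device
# cgroup does not filter access, so nvidia-smi works:
kubectl exec gpu-operator-test -- sh -c 'ls -l /dev/nvidia*'
kubectl exec gpu-operator-test -- nvidia-smi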
Could you enable debug output for the nvidia-container-cli by uncommenting the #debug = lines in /etc/nvidia-container-runtime/config.toml and then include the output from /var/log/nvidia-container-toolkit.log here?
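A sketch of the uncommenting step, in case that is easier than editing the file by hand (the sed expression assumes the stock config shown later in this thread):
# Enable nvidia-container-cli debug logging.
sudo sed -i 's|^#debug = "/var/log/nvidia-container-toolkit.log"|debug = "/var/log/nvidia-container-toolkit.log"|' /etc/nvidia-container-runtime/config.toml
# Re-run the failing workload, then collect the log:
sudo tail -n 100 /var/log/nvidia-container-toolkit.log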
You should also be able to use ctr
directly in this case by running something like:
sudo ctr run --rm -t \
--runc-binary=/usr/bin/nvidia-container-runtime \
--env NVIDIA_VISIBLE_DEVICES=all \
docker.io/nvidia/cuda:11.0.3-base-ubuntu20.04 \
cuda-11.0.3-base-ubuntu20.04 nvidia-smi
(Note how the runc-binary is set to the nvidia-container-runtime.)
If I test the pod with privileged, updating the NVIDIA versions is not needed.
After uncommenting the #debug = lines in /etc/nvidia-container-runtime/config.toml and running the ctr run command, it prints OK. The output of /var/log/nvidia-container-toolkit.log is:
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2022-07-16T07:17:39+08:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2022-07-16T07:17:39+08:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2022-07-16T07:17:39+08:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2022-07-16T07:17:39+08:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2022-07-16T07:17:39+08:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2022-07-16T07:17:44+08:00"}
{"level":"info","msg":"Using OCI specification file path: /run/containerd/io.containerd.runtime.v2.task/default/cuda-11.0.3-base-ubuntu20.04/config.json","time":"2022-07-16T07:17:44+08:00"}
{"level":"info","msg":"Auto-detected mode as 'legacy'","time":"2022-07-16T07:17:44+08:00"}
{"level":"info","msg":"Using prestart hook path: /usr/bin/nvidia-container-runtime-hook","time":"2022-07-16T07:17:44+08:00"}
{"level":"info","msg":"Applied required modification to OCI specification","time":"2022-07-16T07:17:44+08:00"}
{"level":"info","msg":"Forwarding command to runtime","time":"2022-07-16T07:17:44+08:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2022-07-16T07:17:44+08:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2022-07-16T07:17:45+08:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2022-07-16T07:17:45+08:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2022-07-16T07:17:49+08:00"}
{"level":"info","msg":"Using low-level runtime /usr/bin/runc","time":"2022-07-16T07:17:49+08:00"}
If my container runtime is containerd, the /etc/nvidia-container-runtime/config.toml
is:
disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"
#accept-nvidia-visible-devices-envvar-when-unprivileged = true
#accept-nvidia-visible-devices-as-volume-mounts = false
[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
#debug = "/var/log/nvidia-container-toolkit.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
#no-cgroups = false
#user = "root:video"
ldconfig = "@/sbin/ldconfig"
[nvidia-container-runtime]
debug = "/var/log/nvidia-container-runtime.log"
log-level = "info"
# Specify the runtimes to consider. This list is processed in order and the PATH
# searched for matching executables unless the entry is an absolute path.
runtimes = [
"docker-runc",
"runc",
]
mode = "auto"
[nvidia-container-runtime.modes.csv]
mount-spec-path = "/etc/nvidia-container-runtime/host-files-for-container.d"
If my container runtime is dockerd, the /etc/nvidia-container-runtime/config.toml
is:
disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"
#accept-nvidia-visible-devices-envvar-when-unprivileged = true
#accept-nvidia-visible-devices-as-volume-mounts = false
[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
#debug = "/var/log/nvidia-container-toolkit.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
#no-cgroups = false
#user = "root:video"
ldconfig = "@/sbin/ldconfig"
[nvidia-container-runtime]
#debug = "/var/log/nvidia-container-runtime.log"
@elezar Hi, I have encountered a similar problem. The permissions of /dev/nvidia* are 'rw', but nvidia-smi failed.
I found that the entries in devices.list are not right.
As root, I tried
echo "c 195:* rwm" > /sys/fs/cgroup/devices/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podf9413023_9640_4bd8_b76f_b1b629642012.slice/cri-containerd-c33389a1c755d1d6fe2de531890db4bc5e821e41646ac6d2ff7aa83662f00c9e.scope/devices.allow
and
/sys/fs/cgroup/devices/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podf9413023_9640_4bd8_b76f_b1b629642012.slice/cri-containerd-c33389a1c755d1d6fe2de531890db4bc5e821e41646ac6d2ff7aa83662f00c9e.scope/devices.list
changed as expected.
But after a moment, devices.list was restored. Maybe that is the problem: kubelet and containerd may update the cgroup devices at regular intervals. How can this be solved? Thanks!
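For reference, how the reversion can be watched (a sketch; <container-scope> is a placeholder for the same cri-containerd scope path as above):
# After writing to devices.allow, the new "c 195:* rwm" entry shows up in
# devices.list and then disappears again once the rules are re-applied.
watch -n 1 cat /sys/fs/cgroup/devices/<container-scope>/devices.list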
Are you running the plugin with the --pass-device-specs
option? This flag was designed to avoid this exact issue: https://github.com/NVIDIA/k8s-device-plugin#as-command-line-flags-or-envvars
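A sketch of one way to set it on an existing deployment (the daemonset name and namespace below are the usual defaults, not confirmed for this cluster; the envvar form is the one documented in the README linked above):
# Set the envvar equivalent of --pass-device-specs on the device plugin daemonset.
kubectl -n kube-system set env daemonset/nvidia-device-plugin-daemonset PASS_DEVICE_SPECS=true
# Verify it is present once the pods have rolled.
kubectl -n kube-system describe daemonset nvidia-device-plugin-daemonset | grep -i pass_device_specs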
I find that runc update may also change devices.list.
setUnitProperties(m.dbus, unitName, properties...) changes devices.list through systemd.
The properties are built by genV1ResourcesProperties, and the device properties it generates include entry.Path = fmt.Sprintf("/dev/char/%d:%d", rule.Major, rule.Minor).
But /dev/nvidiactl cannot be found at /dev/char/195:255 (the symlink does not exist), so a rule written as DeviceAllow=/dev/char/195:255 rw does not grant access to the real device node.
I want to make a PR to runc along these lines:
// "n:m" rules are just a path in /dev/{block,char}/.
switch rule.Type {
case devices.BlockDevice:
    entry.Path = fmt.Sprintf("/dev/block/%d:%d", rule.Major, rule.Minor)
case devices.CharDevice:
    entry.Path = getCharEntryPath(rule)
}

func isNVIDIADevice(rule *devices.Rule) bool {
    // NVIDIA devices have major 195 and 507
    if rule.Major == 195 || rule.Major == 507 {
        return true
    }
    return false
}

func getNVIDIAEntryPath(rule *devices.Rule) string {
    str := "/dev/"
    switch rule.Major {
    case 195:
        switch rule.Minor {
        case 254:
            str = str + "nvidia-modeset"
        case 255:
            str = str + "nvidiactl"
        default:
            str = str + "nvidia" + strconv.Itoa(int(rule.Minor))
        }
    case 507:
        switch rule.Minor {
        case 0:
            str = str + "nvidia-uvm"
        case 1:
            str = str + "nvidia-uvm-tools"
        }
    }
    return str
}

func getCharEntryPath(rule *devices.Rule) string {
    if isNVIDIADevice(rule) {
        return getNVIDIAEntryPath(rule)
    }
    return fmt.Sprintf("/dev/char/%d:%d", rule.Major, rule.Minor)
}
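A quick way to see the missing-symlink situation described above (a sketch; 195:255 is the nvidiactl major:minor from this thread):
# The /dev/char alias that the generated DeviceAllow= rule points at often
# does not exist, while the real device node does:
ls -l /dev/char/195:255    # frequently: No such file or directory
ls -l /dev/nvidiactl       # crw-rw-rw- 1 root root 195, 255 ...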
Have you seen the same problem? Thank you! @klueska
@klueska Hi, I have encountered the same problem. I used the command cat /var/lib/kubelet/cpu_manager_state and got the following output:
{"policyName":"none","defaultCpuSet":"","checksum":1353318690}
Does this mean that the issue with the cpuset does not exist, and therefore it is not necessary to pass the PASS_DEVICE_SPECS parameter when starting?
Thanks for the confirmation @zvier.
@gwgrisk Note that with newer versions of systemd, when systemd cgroup management is used, it is also required to specify the PASS_DEVICE_SPECS option. The requirement is thus no longer limited to interactions with the CPUManager, since in this case any systemd reload will cause a container to lose access to the underlying device nodes.
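A reproduction sketch of that failure mode, for anyone who wants to confirm they are hitting it (<gpu-pod> is a placeholder for a non-privileged GPU pod; run the reload on the node hosting it):
kubectl exec <gpu-pod> -- nvidia-smi   # works right after the pod starts
sudo systemctl daemon-reload           # on the GPU node
kubectl exec <gpu-pod> -- nvidia-smi   # now fails with "Failed to initialize NVML: Unknown Error"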
1. Issue or feature description
After changing the k8s container runtime from docker to containerd, executing nvidia-smi in a k8s GPU pod returns the error Failed to initialize NVML: Unknown Error and the pod cannot work well.
2. Steps to reproduce the issue
I configured containerd as described in https://docs.nvidia.com/datacenter/cloud-native/kubernetes/install-k8s.html#install-nvidia-container-toolkit-nvidia-docker2. The containerd config diff is:
Then I ran the base test case with the ctr command; it passed and returned the expected output.
When the GPU pod is created from k8s, the pod also runs, but executing nvidia-smi in the pod returns the error Failed to initialize NVML: Unknown Error. The test pod YAML is:
3. Information to attach (optional if deemed irrelevant)
I think the NVIDIA configuration on my host is correct; the only change is that we now use containerd directly as the container runtime instead of docker. If we use docker as the runtime, it works well.
Common error checking:
Additional information that might help better understand your environment and reproduce the bug:
containerd -v: 1.6.5
uname -a: 4.18.0-2.4.3