gongysh2004 opened this issue 2 days ago
Privileged Pods have direct access to the host's devices: they share the host's device namespace and can access everything under the /dev directory, which effectively bypasses the container's device isolation.
So, in our HAMi webhook:
// If the container is privileged, the webhook skips it entirely: the pod spec is
// left unmodified, so scheduling falls back to the default scheduler.
if ctr.SecurityContext.Privileged != nil && *ctr.SecurityContext.Privileged {
    klog.Warningf(template+" - Denying admission as container %s is privileged", req.Namespace, req.Name, req.UID, c.Name)
    continue
}
the code skips handling privileged Pods altogether, which means they fall back to being scheduled by the default scheduler. You can see from the Events you posted that the pod was indeed scheduled by the default-scheduler.
So the reason scheduling fails when resources.limits includes nvidia.com/gpumem is that the default-scheduler doesn't recognize nvidia.com/gpumem.
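If the container does not strictly need privileged mode, the simplest way to get HAMi to manage it again is to drop privileged: true so that the skip branch shown above is not taken. A minimal sketch, assuming the default HAMi resource names; the pod name, image, and memory value are illustrative only:

apiVersion: v1
kind: Pod
metadata:
  name: gpu-test-unprivileged                     # illustrative name
spec:
  containers:
  - name: cuda
    image: nvidia/cuda:12.2.0-base-ubuntu22.04    # illustrative image
    command: ["sleep", "infinity"]
    # no privileged securityContext, so the HAMi webhook processes this container
    resources:
      limits:
        nvidia.com/gpu: 1                         # vGPU count (assumed default HAMi resource name)
        nvidia.com/gpumem: 4096                   # device memory in MB (illustrative value)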
What happened:
A container with a privileged security context failed to be scheduled.

What you expected to happen:
The pod should be scheduled.

How to reproduce it (as minimally and precisely as possible):
Install HAMi according to the install steps, then run the following deployment:
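The deployment manifest itself did not survive the paste. Purely as an illustration of the shape described (a privileged container with an nvidia.com/gpumem limit), a minimal deployment might look like the sketch below; the image, replica count, and memory value are placeholders, not the reporter's actual manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-test                                   # matches the gpu-test-... pod name in the Events below
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-test
  template:
    metadata:
      labels:
        app: gpu-test
    spec:
      containers:
      - name: cuda
        image: nvidia/cuda:12.2.0-base-ubuntu22.04   # placeholder image
        command: ["sleep", "infinity"]
        securityContext:
          privileged: true                           # the condition that makes the HAMi webhook skip the container
        resources:
          limits:
            nvidia.com/gpu: 1                        # assumed default HAMi resource name
            nvidia.com/gpumem: 4096                  # placeholder memory value in MB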
Anything else we need to know?:
- The output of nvidia-smi -a on your host
- Your docker configuration file (e.g: /etc/docker/daemon.json)
root@node7vm-1:~/test# helm ls -A | grep hami
hami            kube-system   2   2024-11-14 15:18:36.886955318 +0800 CST   deployed   hami-2.4.0         2.4.0
my-hami-webui   kube-system   4   2024-11-14 17:18:24.678439025 +0800 CST   deployed   hami-webui-1.0.3   1.0.3
root@node7bm-1:~# nvidia-smi
Thu Nov 14 15:58:33 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.161.08             Driver Version: 535.161.08   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA L40S                    On  | 00000000:08:00.0 Off |                  Off |
| N/A   27C    P8              22W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA L40S                    On  | 00000000:09:00.0 Off |                  Off |
| N/A   28C    P8              21W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA L40S                    On  | 00000000:0E:00.0 Off |                  Off |
| N/A   26C    P8              19W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA L40S                    On  | 00000000:11:00.0 Off |                  Off |
| N/A   26C    P8              21W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   4  NVIDIA L40S                    On  | 00000000:87:00.0 Off |                  Off |
| N/A   26C    P8              21W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   5  NVIDIA L40S                    On  | 00000000:8D:00.0 Off |                  Off |
| N/A   26C    P8              21W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   6  NVIDIA L40S                    On  | 00000000:90:00.0 Off |                  Off |
| N/A   26C    P8              21W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   7  NVIDIA L40S                    On  | 00000000:91:00.0 Off |                  Off |
| N/A   27C    P8              19W / 350W |      0MiB / 49140MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
Linux node7vm-1 5.15.0-125-generic #135-Ubuntu SMP Fri Sep 27 13:53:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Events:
  Type     Reason                    Age   From               Message
  ----     ------                    ----  ----               -------
  Normal   Scheduled                 19s   default-scheduler  Successfully assigned default/gpu-test-5f9f7d48d9-4wsrp to node7bm-1
  Warning  UnexpectedAdmissionError  20s   kubelet            Allocate failed due to rpc error: code = Unknown desc = no binding pod found on node node7bm-1, which is unexpected
Events:
  Type     Reason            Age  From               Message
  ----     ------            ---  ----               -------
  Warning  FailedScheduling  14s  default-scheduler  0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 Insufficient nvidia.com/gpumem. preemption: 0/3 nodes are available: 1 Preemption is not helpful for scheduling, 2 No preemption victims found for incoming pod..