NVIDIA / k8s-device-plugin

NVIDIA device plugin for Kubernetes
Apache License 2.0

Crio integration? #62

Closed jordimassaguerpla closed 5 years ago

jordimassaguerpla commented 5 years ago

Hi

I am trying to use CRI-O with the nvidia-runtime-hook, as explained in (1). However, after creating this daemonset, I ran "kubectl describe nodes" and I don't see any mention of nvidia gpus; in addition, the pods that require a gpu are stuck in the Pending state.

Have you tried this with CRI-O? Do you have instructions on how to make it work? And how can I debug it and get more info?

Thanks

jordimassaguerpla commented 5 years ago

Here is the link (1) about setting up CRI-O:

https://github.com/kubernetes-incubator/cri-o/issues/1222

RenaudWasTaken commented 5 years ago

Hello!

OpenShift uses CRI-O and has a pretty good guide on this that transfers well to vanilla Kubernetes: https://blog.openshift.com/use-gpus-with-device-plugin-in-openshift-3-9/

If you have any errors don't hesitate to ask here :) Closing in the meantime.
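
(For what it's worth, the CRI-O side of that guide boils down to dropping an OCI prestart hook that injects the NVIDIA devices and libraries into containers, then restarting CRI-O. A sketch using the newer 1.0.0 hooks schema — the hook path and schema version here are assumptions, and CRI-O 1.9 as used in the guide expects an older JSON layout, so follow the guide's exact file for that version:)

# Sketch only: hook binary path and hooks.d location are assumptions.
sudo mkdir -p /usr/share/containers/oci/hooks.d
sudo tee /usr/share/containers/oci/hooks.d/oci-nvidia-hook.json <<'EOF'
{
  "version": "1.0.0",
  "hook": {
    "path": "/usr/bin/nvidia-container-runtime-hook",
    "args": ["nvidia-container-runtime-hook", "prestart"]
  },
  "when": { "always": true },
  "stages": ["prestart"]
}
EOF
sudo systemctl restart crio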

jordimassaguerpla commented 5 years ago

Hi. If I look at the kubelet log, I see this:

journalctl -u kubelet

Jul 16 08:13:16 gpu hyperkube[22503]: I0716 08:13:16.795036 22503 nvidia.go:110] NVML initialized. Number of nvidia devices: 1

So my guess is that something is working here.

But then, if I do

kubectl describe pods | grep nvidia | grep gpu

I get nothing. I would expect to see a node that has gpu resources... am I assuming wrong?

How can this be debugged? Are there logs for the nvidia plugin that I could look at?

thanks
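
(A quick sanity check for both questions: whether the node advertises the nvidia.com/gpu resource that the plugin registers, and whether a device plugin pod is running at all. The node name and label below are the ones from this thread:)

kubectl describe node gpu | grep -i 'nvidia.com/gpu'       # should appear under Capacity and Allocatable
kubectl get pods --all-namespaces -o wide | grep nvidia    # is the device plugin pod actually running?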

jordimassaguerpla commented 5 years ago

I think I see where the issue may be: I don't have any pod running named nvidia-device-plugin-ctr. However, I couldn't see any error when deploying https://github.com/NVIDIA/k8s-device-plugin/blob/v1.9/nvidia-device-plugin.yml.

Could you tell me where I should look for errors, or how to debug this?

thanks
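
(In case it helps anyone else: the v1.9 manifest creates the daemonset in the kube-system namespace, so the usual places to look are roughly:)

kubectl -n kube-system get ds nvidia-device-plugin-daemonset        # desired vs. current pod counts
kubectl -n kube-system describe ds nvidia-device-plugin-daemonset   # events explain why pods are not being created
kubectl -n kube-system get pods -l name=nvidia-device-plugin-ds -o wide
kubectl -n kube-system logs -l name=nvidia-device-plugin-ds         # the plugin's own log, once a pod exists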

RenaudWasTaken commented 5 years ago

Hello!

Thanks!

jordimassaguerpla commented 5 years ago

Hi!

First thanks for your quick answer :)

I deployed k8s using SUSE CaaSP. I am working at SUSE and this was my hackweek project actually.

The node is a physical workstation with a nvidia card Geforce GTX 1060. The kubernetes master is running on kvm as a vm on my laptop.

I don't understand which logs. I ran "kubectl create -f ....yaml" and didn't get much. Which logs are you referring to? I looked into the different services using journalctl and didn't see much, but I might have looked for the wrong things... or should I use "kubectl logs"?

What do you need from the node? It has the gpu I mentioned, 12GB of RAM, the disk is an external USB drive, and the CPU is an Intel Xeon. It is a DELL Precision Workstation T3500. Do you need further info?

I know this is very vague, but it would be great if you could give me some hints, especially on which logs to look at and such.

Again, thanks a lot

jordi

jordimassaguerpla commented 5 years ago

Don't know if this is relevant, but here is the output of running "nvidia-container-cli info":

NVRM version: 390.67
CUDA version: 9.1

Device Index: 0
Device Minor: 0
Model: GeForce GTX 1060 3GB
GPU UUID: GPU-f96a76d4-7ba9-07cc-2774-bb7a55ef3e68
Bus Location: 00000000:00.0
Architecture: 6.1

RenaudWasTaken commented 5 years ago

Hello!

jordimassaguerpla commented 5 years ago

Hi Renaud,

I wasn't looking at the kube-system namespace ... my fault.

Here is the output of running describe on the daemonset. It looks like the problem is the Pod Security Policy my user has assigned by default, which prevents mounting host paths for security reasons:

kubectl describe ds -n kube-system nvidia-device-plugin-daemonset
Name: nvidia-device-plugin-daemonset
Selector: name=nvidia-device-plugin-ds
Node-Selector:
Labels: name=nvidia-device-plugin-ds
Annotations:
Desired Number of Nodes Scheduled: 0
Current Number of Nodes Scheduled: 0
Number of Nodes Scheduled with Up-to-date Pods: 0
Number of Nodes Scheduled with Available Pods: 0
Number of Nodes Misscheduled: 0
Pods Status: 0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: name=nvidia-device-plugin-ds
Annotations: scheduler.alpha.kubernetes.io/critical-pod=
Containers:
nvidia-device-plugin-ctr:
Image: nvidia/k8s-device-plugin:1.9
Port:
Host Port:
Environment:
Mounts:
/var/lib/kubelet/device-plugins from device-plugin (rw)
Volumes:
device-plugin:
Type: HostPath (bare host directory volume)
Path: /var/lib/kubelet/device-plugins
HostPathType:
Events:
Type Reason Age From Message


Warning FailedCreate 13m (x19 over 35m) daemonset-controller Error creating: pods "nvidia-device-plugin-daemonset-" is forbidden: unable to validate against any pod security policy: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used]
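
(Rather than switching PodSecurityPolicy off entirely, an alternative is a dedicated policy that allows just this hostPath volume and is granted, via an RBAC "use" permission, to the service account the daemonset runs as. A rough sketch — the policy name and exact field set are assumptions for a 1.9-era cluster, so adapt to the CaaSP defaults:)

kubectl apply -f - <<'EOF'
apiVersion: extensions/v1beta1   # PSP API group on Kubernetes 1.9 (assumption)
kind: PodSecurityPolicy
metadata:
  name: nvidia-device-plugin-psp # illustrative name
spec:
  privileged: false
  volumes:
  - hostPath                     # the only extra permission the daemonset needs here
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
EOF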

jordimassaguerpla commented 5 years ago

So I disabled PodSecurityPolicy and I was able to start the containers.

Here is the log of the container:

2018/07/17 11:00:16 Loading NVML
2018/07/17 11:00:16 Failed to initialize NVML: could not load NVML library.
2018/07/17 11:00:16 If this is a GPU node, did you set the docker default runtime to nvidia?
2018/07/17 11:00:16 You can check the prerequisites at: https://github.com/NVIDIA/k8s-device-plugin#prerequisites
2018/07/17 11:00:16 You can learn how to set the runtime at: https://github.com/NVIDIA/k8s-device-plugin#quick-start

and here is the description of the gpu node:

Name: gpu
Roles:
Labels: beta.kubernetes.io/arch=amd64
        beta.kubernetes.io/os=linux
        kubernetes.io/hostname=gpu
Annotations: flannel.alpha.coreos.com/backend-data={"VtepMAC":"de:d1:41:84:46:46"}
             flannel.alpha.coreos.com/backend-type=vxlan
             flannel.alpha.coreos.com/kube-subnet-manager=true
             flannel.alpha.coreos.com/public-ip=192.168.1.195
             node.alpha.kubernetes.io/ttl=0
             volumes.kubernetes.io/controller-managed-attach-detach=true
CreationTimestamp: Sun, 15 Jul 2018 15:39:51 +0200
Taints:
Unschedulable: false
Conditions:
  Type            Status  LastHeartbeatTime                LastTransitionTime               Reason                      Message
  OutOfDisk       False   Tue, 17 Jul 2018 13:03:21 +0200  Sun, 15 Jul 2018 15:39:51 +0200  KubeletHasSufficientDisk    kubelet has sufficient disk space available
  MemoryPressure  False   Tue, 17 Jul 2018 13:03:21 +0200  Tue, 17 Jul 2018 12:51:19 +0200  KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure    False   Tue, 17 Jul 2018 13:03:21 +0200  Tue, 17 Jul 2018 12:51:19 +0200  KubeletHasNoDiskPressure    kubelet has no disk pressure
  Ready           True    Tue, 17 Jul 2018 13:03:21 +0200  Tue, 17 Jul 2018 12:59:51 +0200  KubeletReady                kubelet is posting ready status. AppArmor enabled
Addresses:
  InternalIP: 192.168.1.195
  Hostname: gpu
Capacity:
  cpu: 2
  memory: 12295404Ki
  pods: 110
Allocatable:
  cpu: 2
  memory: 12193004Ki
  pods: 110
System Info:
  Machine ID: 259a7be9d5d248a08c6485a952818cbd
  System UUID: 44454C4C-4800-1053-8034-B3C04F37354A
  Boot ID: 9c9fe62f-4605-4adb-a71d-8f1bb7531971
  Kernel Version: 4.4.138-59-default
  OS Image: SUSE CaaS Platform 3.0
  Operating System: linux
  Architecture: amd64
  Container Runtime Version: cri-o://1.9.13
  Kubelet Version: v1.9.8
  Kube-Proxy Version: v1.9.8
PodCIDR: 172.16.2.0/23
ExternalID: gpu
Non-terminated Pods: (15 in total)
  Namespace    Name                                  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  default      frontend-67f65745c-g8d64              100m (5%)     0 (0%)      100Mi (0%)       0 (0%)
  default      frontend-67f65745c-ppcvm              100m (5%)     0 (0%)      100Mi (0%)       0 (0%)
  default      frontend-67f65745c-rt46z              100m (5%)     0 (0%)      100Mi (0%)       0 (0%)
  default      nvidia-smi-6                          0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      nvidia-smi-66                         0 (0%)        0 (0%)      0 (0%)           0 (0%)
  default      redis-master-585798d8ff-rfx5l         100m (5%)    0 (0%)      100Mi (0%)       0 (0%)
  default      redis-slave-865486c9df-gwtmq          100m (5%)    0 (0%)      100Mi (0%)       0 (0%)
  default      redis-slave-865486c9df-tvzm7          100m (5%)    0 (0%)      100Mi (0%)       0 (0%)
  kube-system  dex-b55d98998-52sxv                   0 (0%)       0 (0%)      0 (0%)           0 (0%)
  kube-system  dex-b55d98998-9lx49                   0 (0%)       0 (0%)      0 (0%)           0 (0%)
  kube-system  haproxy-gpu                           0 (0%)       0 (0%)      128Mi (1%)       128Mi (1%)
  kube-system  kube-dns-7488679ff9-6xmgk             260m (13%)   0 (0%)      110Mi (0%)       170Mi (1%)
  kube-system  kube-dns-7488679ff9-s4nt7             260m (13%)   0 (0%)      110Mi (0%)       170Mi (1%)
  kube-system  kube-flannel-t6wmr                    0 (0%)       0 (0%)      0 (0%)           0 (0%)
  kube-system  nvidia-device-plugin-daemonset-4vttt  0 (0%)       0 (0%)      0 (0%)           0 (0%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  1120m (56%)   0 (0%)      948Mi (7%)       468Mi (3%)
Events:
  Type    Reason                   Age                From          Message
  Normal  NodeReady                12m (x2 over 2h)   kubelet, gpu  Node gpu status is now: NodeReady
  Normal  NodeHasSufficientMemory  12m (x55 over 1h)  kubelet, gpu  Node gpu status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    12m (x55 over 1h)  kubelet, gpu  Node gpu status is now: NodeHasNoDiskPressure
  Normal  Starting                 3m                 kubelet, gpu  Starting kubelet.
  Normal  NodeHasSufficientDisk    3m (x2 over 3m)    kubelet, gpu  Node gpu status is now: NodeHasSufficientDisk
  Normal  NodeHasSufficientMemory  3m (x2 over 3m)    kubelet, gpu  Node gpu status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    3m (x2 over 3m)    kubelet, gpu  Node gpu status is now: NodeHasNoDiskPressure
  Normal  NodeAllocatableEnforced  3m                 kubelet, gpu  Updated Node Allocatable limit across pods
  Normal  NodeNotReady             3m                 kubelet, gpu  Node gpu status is now: NodeNotReady
  Normal  NodeReady                3m                 kubelet, gpu  Node gpu status is now: NodeReady

jordimassaguerpla commented 5 years ago

The error seems to happen here:

https://github.com/NVIDIA/k8s-device-plugin/blob/v1.9/vendor/github.com/NVIDIA/nvidia-docker/src/nvml/bindings.go#L58

If I understand correctly, this means it cannot load the libnvidia-ml.so.1 library

https://github.com/NVIDIA/k8s-device-plugin/blob/v1.11/vendor/github.com/NVIDIA/nvidia-docker/src/nvml/nvml_dl.c#L23

I don't understand, though, how loading a library within a container has anything to do with having the library installed on the system. What am I missing?
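
(One way to narrow this down: the hook is supposed to bind-mount libnvidia-ml.so.1 from the host into the container at start time, so the library only exists inside the container if the hook actually ran. A rough check — the pod name is a placeholder, and this assumes the plugin image ships a shell and find:)

ldconfig -p | grep libnvidia-ml                          # on the host: is the driver library registered?
kubectl -n kube-system exec nvidia-device-plugin-daemonset-xxxxx -- \
  sh -c 'find / -name "libnvidia-ml.so*" 2>/dev/null'    # inside the container: did the hook mount it?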

RenaudWasTaken commented 5 years ago

Hello,

2018/07/17 11:00:16 If this is a GPU node, did you set the docker default runtime to nvidia?
2018/07/17 11:00:16 You can check the prerequisites at: https://github.com/NVIDIA/k8s-device-plugin#prerequisites
2018/07/17 11:00:16 You can learn how to set the runtime at: https://github.com/NVIDIA/k8s-device-plugin#quick-start

Did you set the docker default runtime to nvidia? Are you using the docker CRI runtime or the containerd runtime?

Thanks
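
(For reference, the "default runtime" change that log message is asking about is the Docker-side setup from the README's quick-start; it does not apply to CRI-O nodes, which use the prestart hook instead. Roughly:)

sudo tee /etc/docker/daemon.json <<'EOF'
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
EOF
sudo systemctl restart docker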

jordimassaguerpla commented 5 years ago

I am using the nvidia-runtime-hook with CRI-O, as explained in https://blog.openshift.com/use-gpus-with-device-plugin-in-openshift-3-9/

I have to run "chmod 0666 /dev/nvidia*" every time: on every reboot and after restarting kubelet. I don't know if it is related.

I see in the kubelet logs

Jul 16 08:13:16 gpu hyperkube[22503]: I0716 08:13:16.795036 22503 nvidia.go:110] NVML initialized. Number of nvidia devices: 1

So I think something worked here. But then, describing the node (kubectl describe) does not say anything about nvidia gpus.
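
(On the chmod 0666 /dev/nvidia* issue: the device nodes come back with restrictive permissions whenever the driver recreates them, which is why the change does not stick. One commonly suggested way to make it persistent is via the NVIDIA kernel module options — the NVreg_* names below are taken from the driver documentation, so verify them with "modinfo nvidia" before relying on this:)

# Sketch: ask the driver to create /dev/nvidia* with mode 0666 at module load.
echo 'options nvidia NVreg_ModifyDeviceFiles=1 NVreg_DeviceFileMode=0666' | \
  sudo tee /etc/modprobe.d/99-nvidia-device-files.conf
# Takes effect the next time the nvidia module is loaded (e.g. after reboot).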

RenaudWasTaken commented 5 years ago

Sorry for dropping this issue, @jordimassaguerpla are you still hitting this bug?

jordimassaguerpla commented 5 years ago

Hi, I moved to another task (this was my Hackweek project :) ). @danielorf : is this still relevant to you?

danielorf commented 5 years ago

@jordimassaguerpla I was eventually able to work around our problems and get the nvidia-runtime-hook to work. I found that I could not get the annotation matching to work correctly and had to rely on the CMD matching. I ran out of time to fully investigate, though, and never filed a proper bug report.

RenaudWasTaken commented 5 years ago

Thanks, closing. Feel free to reply here if you ever get to this bug again.

jordimassaguerpla commented 4 years ago

Hi! Just a heads up: I tried this again (it is SUSE Hackweek again :) ). I found that I had to set this value:

user = "root:video"

into /etc/nvidia-container-runtime/config.toml

IIUC the key part is the video group.
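
(For reference, that setting sits in the [nvidia-container-cli] section of the file — section name as in the shipped default config, everything else left untouched:)

grep -e '^\[nvidia-container-cli\]' -e '^user *=' /etc/nvidia-container-runtime/config.toml
# expected output after the change (assuming the stock file layout):
#   [nvidia-container-cli]
#   user = "root:video"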

Then, I was able to run a CUDA container with podman:

jordi@gpu:~> sudo podman run nvidia/cuda nvidia-smi
Wed Jun 26 15:34:43 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K2000        Off  | 00000000:05:00.0 Off |                  N/A |
| 30%   46C    P8    N/A /  N/A |      0MiB /  1998MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

:)