NVIDIA / gpu-operator

NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/index.html
Apache License 2.0

Feature Request: add option to deploy `PodSecurityPolicy` for driver daemonset via Helm Chart #185

Closed · d-m closed this issue 3 years ago

d-m commented 3 years ago

After deploying the gpu-operator Helm chart on a cluster with pod security policies enabled, adding a GPU instance to the cluster results in the following events:

$ kubectl get events -n gpu-operator-resources                                                                                                                                                         
LAST SEEN   TYPE      REASON         OBJECT                              MESSAGE
81s         Warning   FailedCreate   daemonset/nvidia-driver-daemonset   Error creating: pods "nvidia-driver-daemonset-" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.securityContext.hostPID: Invalid value: true: Host PID is not allowed to be used spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]

Adding a PodSecurityPolicy with these permissions and associated Role and RoleBinding for the nvidia-driver service account fixes the issue.
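
Roughly, the policy and RBAC I added look like this (a sketch of what worked for me; the nvidia-driver name and gpu-operator-resources namespace match my deployment, adjust as needed):

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: nvidia-driver
spec:
  privileged: true        # driver container runs privileged
  hostPID: true           # driver daemonset sets hostPID: true
  volumes:
    - hostPath            # required for the driver's host mounts
    - configMap
    - secret
    - emptyDir
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: nvidia-driver-psp
  namespace: gpu-operator-resources
rules:
  - apiGroups: [policy]
    resources: [podsecuritypolicies]
    resourceNames: [nvidia-driver]
    verbs: [use]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: nvidia-driver-psp
  namespace: gpu-operator-resources
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: nvidia-driver-psp
subjects:
  - kind: ServiceAccount
    name: nvidia-driver
    namespace: gpu-operator-resources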

shivamerla commented 3 years ago

@d-m This is one of the features that will be part of the 1.7.0 release. It's still in review: https://gitlab.com/nvidia/kubernetes/gpu-operator/-/merge_requests/207

Is it possible for you to validate with a private build on your system?

d-m commented 3 years ago

Thanks @shivamerla! Yes I can do that. I'll try today or tomorrow.

shivamerla commented 3 years ago

Thanks @d-m. Please make sure to delete the old clusterpolicies CRD and the gpu-operator clusterroles/bindings before you deploy this, just in case they are lying around.
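
For example (exact resource names can vary with the release name, so double-check with kubectl get first):

$ kubectl delete crd clusterpolicies.nvidia.com
$ kubectl delete clusterrole gpu-operator
$ kubectl delete clusterrolebinding gpu-operator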

d-m commented 3 years ago

I tried installing the operator from the branch specified in your MR and received the following error:

$ kubectl logs -n kube-system gpu-operator-7d8ffb476f-b8b8x                                                                                                                                                                        
unknown flag: --leader-elect
Usage of gpu-operator:
      --zap-devel                        Enable zap development mode (changes defaults to console encoder, debug log level, disables sampling and stacktrace from 'warning' level)
      --zap-encoder encoder              Zap log encoding ('json' or 'console')
      --zap-level level                  Zap log level (one of 'debug', 'info', 'error' or any integer value > 0) (default info)
      --zap-sample sample                Enable zap log sampling. Sampling will be disabled for integer log levels > 1
      --zap-stacktrace-level level       Set the minimum log level that triggers stacktrace generation (default error)
      --zap-time-encoding timeEncoding   Sets the zap time format ('epoch', 'millis', 'nano', or 'iso8601') (default )
unknown flag: --leader-elect

When I deleted the --leader-elect flag from the template, I got a new error:

$ kubectl logs -n kube-system gpu-operator-665fdc747f-497q6                                                                                                                                                                        
{"level":"info","ts":1620251700.8265946,"logger":"cmd","msg":"Go Version: go1.13.15"}
{"level":"info","ts":1620251700.8266578,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1620251700.8266659,"logger":"cmd","msg":"Version of operator-sdk: v0.17.0"}
{"level":"info","ts":1620251700.8269262,"logger":"leader","msg":"Trying to become the leader."}
{"level":"error","ts":1620251702.480583,"logger":"cmd","msg":"","error":"required env POD_NAME not set, please configure downward API","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.1.1/zapr.go:128\nmain.main\n\t/go/src/github.com/NVIDIA/gpu-operator/cmd/manager/main.go:69\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:203"}

I verified that the downward API was configured in the deployed pod security policy; however, it looks like there are some other changes compared to the 1.6.2 version of the Helm chart, which has POD_NAME defined.
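
(The downward API entry the error is asking for would look roughly like this in the operator Deployment's container spec:)

env:
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name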

Is there a development version of the image that goes along with these changes? The image is still specified as 1.6.2 and I didn't see anything newer at https://ngc.nvidia.com/catalog/containers/nvidia:gpu-operator/tags.

shivamerla commented 3 years ago

You would need to build a private image from the master branch, or you can use the quay.io/shivamerla/gpu-operator:psp image.
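
If you build your own image, you can point the chart at it with something like the following (the operator.repository/image/version value names and the chart path are illustrative and may differ between chart versions; check values.yaml):

$ helm install gpu-operator ./deployments/gpu-operator \
    --set operator.repository=quay.io/shivamerla \
    --set operator.image=gpu-operator \
    --set operator.version=psp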

d-m commented 3 years ago

@shivamerla I'll try that today.

d-m commented 3 years ago

Looks like the new Helm chart deployed successfully once I used the updated image.

However, now I'm running into the following error with the nvidia-device-plugin-daemonset:

Warning  FailedCreatePodSandBox  67s (x49 over 11m)  kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox runtime: no runtime for "nvidia" is configured

This might be unrelated, so I'll double check the documentation to make sure I didn't miss something.

shivamerla commented 3 years ago

@d-m Thanks for checking. Please make sure the right runtime is passed during install with --set operator.defaultRuntime=, set to either docker or containerd.
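
For example (assuming the chart is installed from the nvidia Helm repo):

$ helm install gpu-operator nvidia/gpu-operator \
    --set operator.defaultRuntime=containerd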

d-m commented 3 years ago

I have it set to containerd and the toolkit seems to complete successfully:

time="2021-05-06T14:18:32Z" level=info msg="Starting 'setup' for containerd"
time="2021-05-06T14:18:32Z" level=info msg="Parsing arguments: [/usr/local/nvidia/toolkit]"
time="2021-05-06T14:18:32Z" level=info msg="Successfully parsed arguments"
time="2021-05-06T14:18:32Z" level=info msg="Loading config: /runtime/config-dir/config.toml"
time="2021-05-06T14:18:32Z" level=info msg="Config file does not exist, creating new one"
time="2021-05-06T14:18:32Z" level=info msg="Successfully loaded config"
time="2021-05-06T14:18:32Z" level=info msg="Containerd version is v1.4.4"
time="2021-05-06T14:18:32Z" level=info msg="Config version: 2"
time="2021-05-06T14:18:32Z" level=info msg="Updating config"
time="2021-05-06T14:18:32Z" level=info msg="Successfully updated config"
time="2021-05-06T14:18:32Z" level=info msg="Flushing config"
time="2021-05-06T14:18:32Z" level=info msg="Successfully flushed config"
time="2021-05-06T14:18:32Z" level=info msg="Sending SIGHUP signal to containerd"
time="2021-05-06T14:18:32Z" level=info msg="Successfully signaled containerd"
time="2021-05-06T14:18:32Z" level=info msg="Completed 'setup' for containerd"
time="2021-05-06T14:18:32Z" level=info msg="Waiting for signal"

shivamerla commented 3 years ago

Do you see the nvidia runtimeClass object created in the gpu-operator-resources namespace and the [plugins.cri.containerd.runtimes.nvidia] stanza set in /etc/containerd/config.toml?
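
(For a v2 containerd config, the stanza the toolkit writes looks roughly like this; the BinaryName path follows the /usr/local/nvidia/toolkit install root from the log above:)

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia]
  runtime_type = "io.containerd.runc.v2"
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.nvidia.options]
    BinaryName = "/usr/local/nvidia/toolkit/nvidia-container-runtime"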

d-m commented 3 years ago

Yep! However, we deploy the cluster with kops and it looks like kops uses /etc/containerd/config-kops.toml for its containerd configuration. I copied the configuration that the container-toolkit container put in config.toml to config-kops.toml, reloaded the containerd config, and the device-plugin container ran successfully.

Is it possible to override the containerd config location via the helm chart?

shivamerla commented 3 years ago

Yes, you can pass --set toolkit.env[0].name=CONTAINERD_CONFIG --set toolkit.env[0].value="path"
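
For the kops path above, that would be something like:

$ helm install gpu-operator nvidia/gpu-operator \
    --set 'toolkit.env[0].name=CONTAINERD_CONFIG' \
    --set 'toolkit.env[0].value=/etc/containerd/config-kops.toml'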

d-m commented 3 years ago

Just found that in the codebase as you commented. I'll give it a shot.

d-m commented 3 years ago

That did the trick, thanks for your help!

aktiver commented 2 years ago

Does anyone know what the TOML config file is for K3s?