squat / generic-device-plugin

A Kubernetes device plugin to schedule generic Linux devices
Apache License 2.0

Error on missing device #36

Closed: Ruakij closed this issue 1 year ago

Ruakij commented 1 year ago

Hello, I wanted to use generic-device-plugin to schedule pods that need hardware-accelerated video (Intel Quick Sync).

This is my setup:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: generic-device-plugin
  namespace: kube-system
  labels:
    app.kubernetes.io/name: generic-device-plugin
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: generic-device-plugin
  template:
    metadata:
      labels:
        app.kubernetes.io/name: generic-device-plugin
    spec:
      priorityClassName: system-node-critical
      tolerations:
        - operator: "Exists"
          effect: "NoExecute"
        - operator: "Exists"
          effect: "NoSchedule"
      containers:
        - image: ghcr.io/squat/generic-device-plugin
          args:
            - --domain
            - generic-device
            - --device
            - |
              name: serial
              groups:
                - paths:
                    - path: /dev/ttyUSB*
                - paths:
                    - path: /dev/ttyACM*
                - paths:
                    - path: /dev/tty.usb*
                - paths:
                    - path: /dev/cu.*
                - paths:
                    - path: /dev/cuaU*
                - paths:
                    - path: /dev/rfcomm*
            - --device
            - |
              name: video
              groups:
                - paths:
                    - path: /dev/video0
            - --device
            - |
              name: dri
              groups:
                - count: 10
                  paths:
                    - path: /dev/dri/renderD128
                    - path: /dev/dri/card0
            - --device
            - |
              name: fuse
              groups:
                - count: 10
                  paths:
                    - path: /dev/fuse
            - --device
            - |
              name: audio
              groups:
                - count: 10
                  paths:
                    - path: /dev/snd
            - --device
            - |
              name: capture
              groups:
                - paths:
                    - path: /dev/snd/controlC0
                    - path: /dev/snd/pcmC0D0c
                - paths:
                    - path: /dev/snd/controlC1
                      mountPath: /dev/snd/controlC0
                    - path: /dev/snd/pcmC1D0c
                      mountPath: /dev/snd/pcmC0D0c
                - paths:
                    - path: /dev/snd/controlC2
                      mountPath: /dev/snd/controlC0
                    - path: /dev/snd/pcmC2D0c
                      mountPath: /dev/snd/pcmC0D0c
                - paths:
                    - path: /dev/snd/controlC3
                      mountPath: /dev/snd/controlC0
                    - path: /dev/snd/pcmC3D0c
                      mountPath: /dev/snd/pcmC0D0c
          name: generic-device-plugin
          resources:
            requests:
              cpu: 50m
              memory: 10Mi
          ports:
            - containerPort: 8080
              name: http
          securityContext:
            privileged: true
          volumeMounts:
            - name: device-plugin
              mountPath: /var/lib/kubelet/device-plugins
            - name: dev
              mountPath: /dev
      volumes:
        - name: device-plugin
          hostPath:
            path: /var/lib/kubelet/device-plugins
        - name: dev
          hostPath:
            path: /dev
  updateStrategy:
    type: RollingUpdate

As you can see, I request /dev/dri/renderD128 as well as /dev/dri/card0.

But not all nodes actually have a /dev/dri/renderD128, so the plugin pods on those nodes crash:

{"caller":"main.go:218","msg":"Starting the generic-device-plugin for \"generic-device/capture\".","ts":"2023-07-30T17:48:36.817420355Z"}
{"caller":"main.go:218","msg":"Starting the generic-device-plugin for \"generic-device/video\".","ts":"2023-07-30T17:48:36.817529452Z"}
{"caller":"main.go:218","msg":"Starting the generic-device-plugin for \"generic-device/serial\".","ts":"2023-07-30T17:48:36.817756832Z"}
{"caller":"main.go:218","msg":"Starting the generic-device-plugin for \"generic-device/fuse\".","ts":"2023-07-30T17:48:36.81795204Z"}
{"caller":"plugin.go:114","level":"info","msg":"listening on Unix socket","resource":"generic-device/fuse","socket":"/var/lib/kubelet/device-plugins/gdp-Z2VuZXJpYy1kZXZpY2UvZnVzZQ==-1690739316.sock","ts":"2023-07-30T17:48:36.81810015Z"}
{"caller":"plugin.go:114","level":"info","msg":"listening on Unix socket","resource":"generic-device/serial","socket":"/var/lib/kubelet/device-plugins/gdp-Z2VuZXJpYy1kZXZpY2Uvc2VyaWFs-1690739316.sock","ts":"2023-07-30T17:48:36.818275051Z"}
{"caller":"main.go:218","msg":"Starting the generic-device-plugin for \"generic-device/dri\".","ts":"2023-07-30T17:48:36.818504645Z"}
{"caller":"plugin.go:114","level":"info","msg":"listening on Unix socket","resource":"generic-device/dri","socket":"/var/lib/kubelet/device-plugins/gdp-Z2VuZXJpYy1kZXZpY2UvZHJp-1690739316.sock","ts":"2023-07-30T17:48:36.818620283Z"}
{"caller":"plugin.go:174","level":"info","msg":"waiting for the gRPC server to be ready","resource":"generic-device/fuse","ts":"2023-07-30T17:48:36.818764636Z"}
{"caller":"main.go:218","msg":"Starting the generic-device-plugin for \"generic-device/audio\".","ts":"2023-07-30T17:48:36.818575279Z"}
{"caller":"plugin.go:114","level":"info","msg":"listening on Unix socket","resource":"generic-device/audio","socket":"/var/lib/kubelet/device-plugins/gdp-Z2VuZXJpYy1kZXZpY2UvYXVkaW8=-1690739316.sock","ts":"2023-07-30T17:48:36.819089851Z"}
{"caller":"plugin.go:114","level":"info","msg":"listening on Unix socket","resource":"generic-device/video","socket":"/var/lib/kubelet/device-plugins/gdp-Z2VuZXJpYy1kZXZpY2UvdmlkZW8=-1690739316.sock","ts":"2023-07-30T17:48:36.818050307Z"}
{"caller":"plugin.go:122","level":"info","msg":"starting gRPC server","resource":"generic-device/dri","ts":"2023-07-30T17:48:36.81943341Z"}
{"caller":"plugin.go:174","level":"info","msg":"waiting for the gRPC server to be ready","resource":"generic-device/dri","ts":"2023-07-30T17:48:36.819574657Z"}
{"caller":"plugin.go:174","level":"info","msg":"waiting for the gRPC server to be ready","resource":"generic-device/audio","ts":"2023-07-30T17:48:36.819797719Z"}
{"caller":"plugin.go:174","level":"info","msg":"waiting for the gRPC server to be ready","resource":"generic-device/serial","ts":"2023-07-30T17:48:36.818663114Z"}
{"caller":"plugin.go:122","level":"info","msg":"starting gRPC server","resource":"generic-device/fuse","ts":"2023-07-30T17:48:36.818698491Z"}
{"caller":"plugin.go:114","level":"info","msg":"listening on Unix socket","resource":"generic-device/capture","socket":"/var/lib/kubelet/device-plugins/gdp-Z2VuZXJpYy1kZXZpY2UvY2FwdHVyZQ==-1690739316.sock","ts":"2023-07-30T17:48:36.817984061Z"}
{"caller":"plugin.go:122","level":"info","msg":"starting gRPC server","resource":"generic-device/video","ts":"2023-07-30T17:48:36.819858333Z"}
{"caller":"plugin.go:174","level":"info","msg":"waiting for the gRPC server to be ready","resource":"generic-device/video","ts":"2023-07-30T17:48:36.819881907Z"}
{"caller":"plugin.go:122","level":"info","msg":"starting gRPC server","resource":"generic-device/serial","ts":"2023-07-30T17:48:36.818592661Z"}
{"caller":"plugin.go:186","level":"info","msg":"the gRPC server is ready","resource":"generic-device/dri","ts":"2023-07-30T17:48:36.820697529Z"}
{"caller":"plugin.go:224","level":"info","msg":"registering plugin with kubelet","resource":"generic-device/dri","ts":"2023-07-30T17:48:36.820886697Z"}
{"caller":"plugin.go:122","level":"info","msg":"starting gRPC server","resource":"generic-device/audio","ts":"2023-07-30T17:48:36.821122172Z"}
{"caller":"plugin.go:122","level":"info","msg":"starting gRPC server","resource":"generic-device/capture","ts":"2023-07-30T17:48:36.821579135Z"}
{"caller":"plugin.go:174","level":"info","msg":"waiting for the gRPC server to be ready","resource":"generic-device/capture","ts":"2023-07-30T17:48:36.821965034Z"}
{"caller":"plugin.go:186","level":"info","msg":"the gRPC server is ready","resource":"generic-device/serial","ts":"2023-07-30T17:48:36.822779514Z"}
{"caller":"plugin.go:224","level":"info","msg":"registering plugin with kubelet","resource":"generic-device/serial","ts":"2023-07-30T17:48:36.822833887Z"}
{"caller":"plugin.go:186","level":"info","msg":"the gRPC server is ready","resource":"generic-device/audio","ts":"2023-07-30T17:48:36.82299425Z"}
{"caller":"plugin.go:224","level":"info","msg":"registering plugin with kubelet","resource":"generic-device/audio","ts":"2023-07-30T17:48:36.823032272Z"}
{"caller":"plugin.go:186","level":"info","msg":"the gRPC server is ready","resource":"generic-device/fuse","ts":"2023-07-30T17:48:36.823755299Z"}
{"caller":"plugin.go:224","level":"info","msg":"registering plugin with kubelet","resource":"generic-device/fuse","ts":"2023-07-30T17:48:36.826360352Z"}
{"caller":"plugin.go:186","level":"info","msg":"the gRPC server is ready","resource":"generic-device/video","ts":"2023-07-30T17:48:36.829967489Z"}
{"caller":"plugin.go:224","level":"info","msg":"registering plugin with kubelet","resource":"generic-device/video","ts":"2023-07-30T17:48:36.830163349Z"}
{"caller":"generic.go:225","level":"info","msg":"starting listwatch","resource":"generic-device/audio","ts":"2023-07-30T17:48:36.830563465Z"}
{"caller":"generic.go:225","level":"info","msg":"starting listwatch","resource":"generic-device/fuse","ts":"2023-07-30T17:48:36.831076755Z"}
{"caller":"generic.go:225","level":"info","msg":"starting listwatch","resource":"generic-device/dri","ts":"2023-07-30T17:48:36.831287243Z"}
panic: runtime error: index out of range [0] with length 0

goroutine 144 [running]:
github.com/squat/generic-device-plugin/deviceplugin.(*GenericPlugin).discoverPath(0x0?)
    /src/deviceplugin/path.go:108 +0x9fc
github.com/squat/generic-device-plugin/deviceplugin.(*GenericPlugin).discover(0xc0002209b0)
    /src/deviceplugin/generic.go:130 +0x25
github.com/squat/generic-device-plugin/deviceplugin.(*GenericPlugin).refreshDevices(0xc0002209b0)
    /src/deviceplugin/generic.go:151 +0x55
github.com/squat/generic-device-plugin/deviceplugin.(*GenericPlugin).ListAndWatch(0xc0002209b0, 0xb43de0?, {0xc873d0, 0xc0003b71b0})
    /src/deviceplugin/generic.go:226 +0xfc
k8s.io/kubelet/pkg/apis/deviceplugin/v1beta1._DevicePlugin_ListAndWatch_Handler({0xb2f780?, 0xc0002209b0}, {0xc86220, 0xc0000e89a0})
    /go/pkg/mod/k8s.io/kubelet@v0.20.5/pkg/apis/deviceplugin/v1beta1/api.pb.go:1424 +0xd0
google.golang.org/grpc.(*Server).processStreamingRPC(0xc00029a5a0, {0xc87cb8, 0xc0002fcd00}, 0xc000292900, 0xc00007c960, 0x10bc180, 0x0)
    /go/pkg/mod/google.golang.org/grpc@v1.53.0/server.go:1620 +0x11e7
google.golang.org/grpc.(*Server).handleStream(0xc00029a5a0, {0xc87cb8, 0xc0002fcd00}, 0xc000292900, 0x0)
    /go/pkg/mod/google.golang.org/grpc@v1.53.0/server.go:1708 +0x9ea
google.golang.org/grpc.(*Server).serveStreams.func1.2()
    /go/pkg/mod/google.golang.org/grpc@v1.53.0/server.go:965 +0x98
created by google.golang.org/grpc.(*Server).serveStreams.func1
    /go/pkg/mod/google.golang.org/grpc@v1.53.0/server.go:963 +0x28a

I kind of expected the plugin to just ignore missing devices, but apparently it doesn't?

Expected result: When one device in a group doesn't exist, skip the group.

This also sparked another problem: I cannot distinguish between cards or the features they support. A more powerful card might be able to do more and offer certain feature sets, such as encoders.
In fact, I only have one node with Intel Quick Sync, but another node (one of the Oracle ARM systems) also has a /dev/dri/renderD128 device, and that one just doesn't work at all. I don't think the plugin could detect that.

When discovering special devices, we could inspect them more deeply, discover their features, and then make those features requestable.

E.g. generic-device/dri-VAProfileH264High to show that we discovered the VAProfileH264High encoding profile.
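
To make the naming idea concrete, here is a minimal Go sketch of how discovered profiles could be turned into resource names of that form. It is only an illustration, not something the plugin does: the functions `encodeProfiles` and `resourceNames` are hypothetical, and the input is assumed to be text shaped like `vainfo` output, one "<profile> : <entrypoint>" pair per line, where an entrypoint starting with `VAEntrypointEnc` is taken to mean encode support.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// encodeProfiles returns the sorted, de-duplicated profile names that list an
// encode entrypoint. The "<profile> : <entrypoint>" line shape is an
// assumption about vainfo-style text, not something the plugin parses.
func encodeProfiles(vainfoOutput string) []string {
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(vainfoOutput))
	for sc.Scan() {
		profile, entrypoint, ok := strings.Cut(sc.Text(), ":")
		if !ok {
			continue // not a "key : value" line
		}
		profile = strings.TrimSpace(profile)
		entrypoint = strings.TrimSpace(entrypoint)
		if strings.HasPrefix(profile, "VAProfile") && strings.HasPrefix(entrypoint, "VAEntrypointEnc") {
			seen[profile] = true
		}
	}
	names := make([]string, 0, len(seen))
	for p := range seen {
		names = append(names, p)
	}
	sort.Strings(names)
	return names
}

// resourceNames builds names in the "<domain>/<device>-<profile>" form
// suggested in the comment above (hypothetical naming scheme).
func resourceNames(domain, device string, profiles []string) []string {
	names := make([]string, 0, len(profiles))
	for _, p := range profiles {
		names = append(names, fmt.Sprintf("%s/%s-%s", domain, device, p))
	}
	return names
}

func main() {
	// Illustrative sample only; real output depends on the driver and card.
	const sample = `
      VAProfileH264Main      : VAEntrypointVLD
      VAProfileH264High      : VAEntrypointVLD
      VAProfileH264High      : VAEntrypointEncSlice
`
	for _, name := range resourceNames("generic-device", "dri", encodeProfiles(sample)) {
		fmt.Println(name)
	}
}
```

A node without a working encoder would yield no names here, so nothing would be advertised for it, which is the distinction asked for above.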

squat commented 1 year ago

Expected result: When one device in a group doesn't exist, skip the group

Yes, this is absolutely the intended behavior of the plugin. The behavior you documented is a bug. It should be easy to create a reproduction case for the e2e tests to also avoid regressions.
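
For illustration, the intended behavior (skip a group when any of its paths is missing) could look roughly like the Go sketch below. This is not the plugin's actual code and not the fix itself: the `group` type and the `discoverGroup` function are hypothetical names, and the only library call assumed is `filepath.Glob` from the standard library.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// group mirrors one entry under "groups" in the configuration above:
// a set of path patterns that must all be present for the group to count.
// (Hypothetical type for this sketch.)
type group struct {
	paths []string
}

// discoverGroup expands every pattern in the group. It returns ok == false,
// rather than indexing into an empty match list, as soon as one pattern
// matches nothing on this node.
func discoverGroup(g group) ([][]string, bool, error) {
	var found [][]string
	for _, pattern := range g.paths {
		matches, err := filepath.Glob(pattern)
		if err != nil {
			return nil, false, fmt.Errorf("bad pattern %q: %w", pattern, err)
		}
		if len(matches) == 0 {
			return nil, false, nil // a device is missing: skip the whole group
		}
		found = append(found, matches)
	}
	return found, true, nil
}

func main() {
	groups := []group{
		{paths: []string{"/dev/dri/renderD128", "/dev/dri/card0"}},
		{paths: []string{"/dev/fuse"}},
	}
	for i, g := range groups {
		found, ok, err := discoverGroup(g)
		switch {
		case err != nil:
			fmt.Println("group", i, "error:", err)
		case !ok:
			fmt.Println("group", i, "skipped: at least one path is missing")
		default:
			fmt.Println("group", i, "found:", found)
		}
	}
}
```

On a node without /dev/dri/renderD128, the first group would be reported as skipped instead of causing a panic.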

Regarding special devices, do you have any ideas around doing feature discovery and writing feature specification in the configuration?

squat commented 1 year ago

@Ruakij I just ran some tests and was able to track down the bug. It's a one word change :) I'm going to create a test case to avoid regressions.

Ruakij commented 1 year ago

Great! Thought it was just a minor thing :)

Regarding special devices, do you have any ideas around doing feature discovery and writing feature specification in the configuration?

I am not entirely sure.

I'd also like to know which devices the system has detected. Currently there is nothing in the logs for this.
Ideally this would also be machine-readable at a well-known endpoint, maybe as an annotation or just an API.

squat commented 1 year ago

Detect special devices, read properties, present as sub-resources

This is the part that I'm most concerned about. I cannot imagine a way to do this generically for any random Linux device without building special logic into the plugin for each different kind of device. Maybe that's a future plugin system? Then we'd have plugins inside of plugins :p. Or can you imagine some standard syscall or file system discovery mechanism we could use to surface more information to the cluster? Ideally, the point of this all would be to surface more information that can be used for scheduling pods and allocating resources to them.

I'd also like to know which devices the system has detected

How is this different from looking at kubectl describe node X and reading the available devices that the plugin has found on that node?

Part of the point of Kubernetes device allocation is that any device with the same name should be fungible from the point of view of the pod, in other words, the exact underlying device should not matter to the pod, so surfacing more information about the concrete device shouldn't be relevant. Still, maybe we can log it for administrators to debug?
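
For a machine-readable version of the `kubectl describe node X` check, one option is to read the node's allocatable resources from its JSON form. The Go sketch below assumes the input is the output of `kubectl get node X -o json` and that the plugin's resources share the configured domain prefix (`generic-device` in this thread). The function name `pluginResources` and the sample values are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// node holds only the part of a Node object that this sketch reads.
type node struct {
	Status struct {
		Allocatable map[string]string `json:"allocatable"`
	} `json:"status"`
}

// pluginResources returns the allocatable entries whose name starts with
// "<domain>/", i.e. the resources advertised under that domain.
func pluginResources(nodeJSON []byte, domain string) (map[string]string, error) {
	var n node
	if err := json.Unmarshal(nodeJSON, &n); err != nil {
		return nil, fmt.Errorf("decoding node: %w", err)
	}
	out := map[string]string{}
	for name, quantity := range n.Status.Allocatable {
		if strings.HasPrefix(name, domain+"/") {
			out[name] = quantity
		}
	}
	return out, nil
}

func main() {
	// Made-up sample standing in for `kubectl get node X -o json`.
	sample := []byte(`{"status":{"allocatable":{
		"cpu":"4","memory":"8Gi",
		"generic-device/dri":"10","generic-device/fuse":"10"}}}`)

	res, err := pluginResources(sample, "generic-device")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	names := make([]string, 0, len(res))
	for name := range res {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s: %s\n", name, res[name])
	}
}
```

This only shows what the node advertises. It does not show which concrete device backs a resource, which matches the fungibility point above.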

Ruakij commented 1 year ago

I cannot imagine a way to do this generically for any random Linux device

Yeah, this is only really possible using modules/plugins for these special devices to discover their capabilities.

How is this different from looking at kubectl describe node X [..]

Oh, I didn't know this already existed. But for debugging it might also be interesting to see on which device discovery stopped or failed.

the exact underlying device should not matter to the pod

This would be true if every device were actually the same and didn't just "look" the same. This is partially doable in a homogeneous cluster where every node has the same device (or at least a few have the exact same one). In reality, many clusters are probably heterogeneous.

But you are correct, I think discovering the capabilities of the devices themselves might be out of scope and has to be managed some other way. E.g. we know node1-3 have GPUx, so we can advertise the renderD128 device as this specific gpu-dri-device on those nodes (this could be done using multiple DaemonSets and affinity rules, or just node labels + kubemod).

KISS Principle