Closed artem-zinnatullin closed 1 year ago
@artem-zinnatullin thanks for reporting this. Can you please share any relevant logs from the kubelet on the affected node at the time of the errors?
One thing that occurs to me is that the issue might arise because this project is not strictly observing the initialization order required by Kubernetes. Strictly speaking, the gRPC server for the plugin must be started before the plugin registers itself with the kubelet; however, this plugin does both concurrently [0]. This could be causing the issue and would not be difficult to correct.
[0] https://github.com/squat/generic-device-plugin/blob/main/deviceplugin/plugin.go#L115-L160
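For illustration, here is a minimal sketch of the ordering described above: start serving on the plugin socket, wait until the socket accepts connections, and only then register. It uses only the Go standard library as a stand-in for the gRPC server. The names startThenRegister, waitForSocket, and demo are hypothetical and are not taken from this project's code, and the register callback merely stands in for the kubelet's Register RPC.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"time"
)

// waitForSocket polls the Unix socket until a connection succeeds
// or the timeout expires.
func waitForSocket(path string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		conn, err := net.DialTimeout("unix", path, 100*time.Millisecond)
		if err == nil {
			return conn.Close()
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("socket %s not ready: %w", path, err)
		}
		time.Sleep(10 * time.Millisecond)
	}
}

// startThenRegister (hypothetical helper) starts serving first and calls
// register only after the socket is confirmed reachable.
func startThenRegister(socket string, serve func(net.Listener), register func() error) (net.Listener, error) {
	l, err := net.Listen("unix", socket)
	if err != nil {
		return nil, err
	}
	go serve(l) // stand-in for a gRPC server's Serve(l)
	if err := waitForSocket(socket, 5*time.Second); err != nil {
		l.Close()
		return nil, err
	}
	if err := register(); err != nil {
		l.Close()
		return nil, err
	}
	return l, nil
}

// demo runs the sequence against a throwaway socket and reports whether
// the server was reachable at the moment register ran.
func demo() (bool, error) {
	dir, err := os.MkdirTemp("", "plugin")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	socket := filepath.Join(dir, "plugin.sock")

	serve := func(l net.Listener) {
		for {
			c, err := l.Accept()
			if err != nil {
				return // listener was closed
			}
			c.Close()
		}
	}
	reachable := false
	register := func() error {
		// Assumption: a real plugin would call the kubelet's Register RPC here.
		c, err := net.Dial("unix", socket)
		if err != nil {
			return err
		}
		reachable = true
		return c.Close()
	}

	l, err := startThenRegister(socket, serve, register)
	if err != nil {
		return false, err
	}
	defer l.Close()
	return reachable, nil
}

func main() {
	ok, err := demo()
	fmt.Println(ok, err)
}
```

The point of the sketch is only the sequencing: registration is gated on the socket being dialable, so the kubelet should never be told about a plugin endpoint that is not yet listening.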
Hi @artem-zinnatullin, can you please update to the latest version of the plugin now that #35 is merged? I wonder if this might have any effect on the UnexpectedAdmissionError message you are seeing. If not, then we will need to see logs from your kubelet and scheduler before we can proceed any further.
FWIW, I see similar messages on clusters running the NVIDIA device plugin. I wonder to what extent this is common to device plugins.
Oh, one last question: what version of k8s are you running?
Closing for now. Please re-open if you think there is a problem with the device plugin or you need more help!
Hi!
I use generic-device-plugin to expose a Zigbee USB stick to Zigbee2Mqtt.
It works, and I'm able to match the node with squat.ai/zigbee: 1 in a Deployment with replicas: 1.
However, if the node (it runs as both the K8S controller and a K8S worker) restarts, I start seeing many instances of that pod in the UnexpectedAdmissionError state, despite it being run as a Deployment with replicas: 1.
kubectl describe pod gives this:
My understanding is that at the time K8S tries to run my Deployment, generic-device-plugin hasn't started yet, and when it does there is some race condition: K8S tries to spin up many pods with access to the same device, only one pod succeeds, and the others fail into UnexpectedAdmissionError.
I wonder if there is any solution to this?