ROCm / k8s-device-plugin

Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster
Apache License 2.0

Issues in helm chart #19

Open boniek83 opened 3 years ago

boniek83 commented 3 years ago

Helm chart version: 0.0.2

By "DaemonSet" I mean the k8s-device-plugin DaemonSet.

- The DaemonSet should only create pods on nodes that have AMD hardware. It should have a node selector and respect the node_selector value from the default values file; at this time it doesn't.
- The DaemonSet should have a resources section (to limit resource usage).
- The DaemonSet should have priorityClassName: system-node-critical set (the annotation is deprecated: https://github.com/kubernetes/website/commit/315c716774af079afa0c0479ff4513549c48a5e9/).
- It would be better (although not critical, as long as it can be overridden) if the default node_selector were compatible with how NVIDIA's bundled NFD behaves (for clusters with mixed GPUs), with a default value of feature.node.kubernetes.io/pci-1002.present=true.
- Using the latest tag for the labeller is a bad idea, because the deployment is not reproducible.
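For illustration, the requested changes could look roughly like the DaemonSet fragment below. This is only a sketch, not the chart's actual template: the object, label, and container names are hypothetical, the image tag is a placeholder to be replaced with a real pinned version, and the resource values are arbitrary examples. It also assumes NFD is configured to emit vendor-only PCI labels such as pci-1002.present.

```yaml
# Sketch of a device-plugin DaemonSet with the requested settings.
# All names, the image tag, and the resource values are assumptions.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: amdgpu-device-plugin        # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: amdgpu-device-plugin    # hypothetical label
  template:
    metadata:
      labels:
        name: amdgpu-device-plugin
    spec:
      # Schedule only on nodes labelled as having a PCI device
      # with vendor ID 1002 (AMD); assumes NFD emits this label.
      nodeSelector:
        feature.node.kubernetes.io/pci-1002.present: "true"
      # Replaces the deprecated critical-pod annotation.
      priorityClassName: system-node-critical
      containers:
        - name: device-plugin       # hypothetical container name
          # Placeholder: pin an explicit version instead of "latest".
          image: "rocm/k8s-device-plugin:<pinned-tag>"
          # Example values only; tune for the actual workload.
          resources:
            requests:
              cpu: 50m
              memory: 32Mi
            limits:
              cpu: 100m
              memory: 64Mi
```

In a Helm chart, the node selector, image tag, and resources would normally be read from the values file so that they can be overridden per cluster.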

julian3xl commented 2 years ago

I'm affected by this issue too, so I've created a pull request to fix it: https://github.com/RadeonOpenCompute/k8s-device-plugin/pull/29