Closed ejlee125 closed 1 year ago
I found that the failure was caused by a ClusterRole issue in gpu-feature-discovery. After adding the "create" verb for the nodefeatures resource in the gpu-feature-discovery ClusterRole, the "feature.node" labels on the node were listed successfully.
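For anyone hitting the same thing, the rule I mean looks roughly like the fragment below. This is only a sketch: the ClusterRole name, the API group (`nfd.k8s-sigs.io`) and the verbs other than "create" are my assumptions, so check them against the role your Helm release actually created before changing anything.

```yaml
# Sketch only: role name, API group, and the non-"create" verbs are assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gpu-feature-discovery          # hypothetical; use the name from your release
rules:
  - apiGroups: ["nfd.k8s-sigs.io"]     # assumed group of the nodefeatures resource
    resources: ["nodefeatures"]
    verbs: ["get", "list", "watch", "create"]   # "create" was the missing verb
```

It is probably safer to merge the rule into the existing ClusterRole (for example with `kubectl edit clusterrole <name>`) than to apply this fragment as-is, since applying it would replace any other rules the role already has.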
Hello, I am trying to set up Kubernetes on MIG GPUs with nvidia-device-plugin and gpu-feature-discovery. I installed both repos with Helm 3, and "kubectl describe node" shows "nvidia-com:mig-~~" in the Capacity and Allocatable sections. The "feature.node.kubernetes.io/cpu-" items are also listed in the Labels section. But I cannot see any labels starting with "nvidia.com".
The gpu-node-feature pod also shows errors.
Versions: gpu-feature-discovery 0.8.0, nvidia-device-plugin 0.12.0
How can I fix this?