NVIDIA / gpu-feature-discovery

GPU plugin to the node feature discovery for Kubernetes
Apache License 2.0
287 stars, 47 forks

Any plan to upgrade NFD helm chart to latest v0.10.1? #21

Closed anaconda2196 closed 1 year ago

anaconda2196 commented 2 years ago

Hi @klueska @elezar

I am using GFD v0.4.1 with NFD v0.6.0 (https://github.com/NVIDIA/gpu-feature-discovery/tree/master/deployments/helm/gpu-feature-discovery)

For MIG, I am following https://github.com/NVIDIA/gpu-feature-discovery#deployment-via-helm (the GFD Helm chart also deploys Node Feature Discovery (NFD) as a prerequisite). Reference: https://docs.nvidia.com/datacenter/cloud-native/mig/mig-k8s.html

Everything is working as expected.

I am facing the same issue as in https://github.com/kubernetes-sigs/node-feature-discovery/issues/539

I am trying to bump NFD to v0.10.0, but it looks like GFD is not compatible with it.

Do you have any plans to upgrade the NFD Helm chart here (https://github.com/NVIDIA/gpu-feature-discovery/tree/master/deployments/helm/gpu-feature-discovery) to the latest version?

klueska commented 2 years ago

Can you explain what you mean by "incompatible" with it?

I realize that the embedded NFD chart in the GFD repo is very much out of date (and should probably be updated), but we run GFD with NFD version 0.10.1 in the NVIDIA GPU Operator without issues: https://github.com/NVIDIA/gpu-operator/blob/master/deployments/gpu-operator/charts/node-feature-discovery/Chart.yaml

Have you deployed NFD v0.10.0 manually and tried to run GFD on top of it? That is to say, deploying NFD manually, and then deploying GFD with nfd.deploy=false.
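
For reference, the two-step deployment described above could be sketched roughly as follows. This is only an illustration, not an official procedure: the Helm repository aliases and URLs, the release names (`nfd`, `gfd`), the namespace, and the exact chart version strings are assumptions and should be checked against each project's README. The only setting taken directly from this thread is `nfd.deploy=false`. The script just builds and prints the `helm` commands so they can be reviewed before anything is run.

```python
import shlex

# Assumed Helm repository aliases and URLs; verify against each project's README.
NFD_REPO = ("nfd", "https://kubernetes-sigs.github.io/node-feature-discovery/charts")
GFD_REPO = ("nvgfd", "https://nvidia.github.io/gpu-feature-discovery")


def build_commands(nfd_version="0.10.0", gfd_version="0.4.1"):
    """Return the helm invocations (as argv lists) for a two-step deployment.

    The version strings are assumptions about how the charts are tagged.
    """
    return [
        ["helm", "repo", "add", *NFD_REPO],
        ["helm", "repo", "add", *GFD_REPO],
        ["helm", "repo", "update"],
        # Step 1: install NFD on its own, pinned to the version under test.
        # Release name and namespace are hypothetical choices.
        ["helm", "install", "nfd", "nfd/node-feature-discovery",
         "--version", nfd_version,
         "--namespace", "node-feature-discovery", "--create-namespace"],
        # Step 2: install GFD with its embedded NFD subchart disabled.
        ["helm", "install", "gfd", "nvgfd/gpu-feature-discovery",
         "--version", gfd_version,
         "--set", "nfd.deploy=false"],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in build_commands():
        print(shlex.join(argv))
```

If GFD labels the node correctly on top of the manually installed NFD, the problem is more likely in the outdated embedded chart than in GFD itself.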

elezar commented 2 years ago

@anaconda2196 were you able to solve your issue here?

elezar commented 1 year ago

@anaconda2196, since the NFD Helm chart version is now at 0.11.0, I am closing this issue.