NVIDIA / gpu-operator

NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/index.html
Apache License 2.0

Installing GPU Operator v24.6.1 fails to install /usr/local/nvidia/toolkit #925

Open coderRenxy opened 2 months ago

coderRenxy commented 2 months ago

1. Quick Debug Information

2. Issue or feature description

  1. Installing GPU Operator v24.6.1 does not install the nvidia-toolkit, whereas installing v24.3.0 installs it successfully.
  2. Customization options are not producing the desired effects.

3. Steps to reproduce the issue

  1. helm install --wait --generate-name \
    -n gpu-operator --create-namespace \
    nvidia/gpu-operator
  2. helm install --wait --generate-name \
    -n gpu-operator --create-namespace \
    nvidia/gpu-operator \
    --set operator.defaultRuntime=containerd \
    --set driver.enabled=false \
    --set driver.nvidiaDriverCRD.enabled=true \
    --version v24.3.0

cdesiniotis commented 2 months ago

@coderRenxy please provide relevant logs and more details on the issue you are encountering.
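
For reference, the kind of information requested here could be gathered with something like the sketch below. It is not an official collection script. It assumes the gpu-operator namespace used in the helm commands above; the NS, POD, and DRY_RUN variables and the run and collect_diagnostics helpers are hypothetical names introduced only for this example.

```shell
#!/usr/bin/env bash
# Sketch: collect basic diagnostics for a GPU Operator install.
# Assumption: the operator lives in the "gpu-operator" namespace,
# as in the helm commands from the issue.
NS="${NS:-gpu-operator}"
# Hypothetical: name of a failing pod to inspect (empty = skip).
POD="${POD:-}"
# Hypothetical: set DRY_RUN=1 to print the commands without running them.
DRY_RUN="${DRY_RUN:-0}"

run() {
  # Echo the command, then execute it unless DRY_RUN is 1.
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

collect_diagnostics() {
  # Which chart release and version is actually installed.
  run helm list -n "$NS"
  # Pod and daemonset status in the operator namespace.
  run kubectl get pods -n "$NS" -o wide
  run kubectl get daemonsets -n "$NS"
  # Recent events, oldest first.
  run kubectl get events -n "$NS" --sort-by=.lastTimestamp
  # Details and logs for one pod, if a name was given.
  if [ -n "$POD" ]; then
    run kubectl describe pod -n "$NS" "$POD"
    run kubectl logs -n "$NS" "$POD" --all-containers
  fi
}
```

Calling `collect_diagnostics` with POD set to the name of a failing pod (taken from the `kubectl get pods` output) adds that pod's description and logs. Whether /usr/local/nvidia/toolkit exists has to be checked separately on the GPU node itself, for example with `ls -l /usr/local/nvidia/toolkit`.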