NVIDIA / gpu-operator

NVIDIA GPU Operator creates, configures, and manages GPUs in Kubernetes
https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/index.html
Apache License 2.0

Add validation that nouveau is blacklisted #974

Open lengrongfu opened 2 months ago

lengrongfu commented 2 months ago

When preinstalled drivers on the host are used, other components may fail because the user did not disable the nouveau driver.

cdesiniotis commented 1 month ago

Hi @lengrongfu, how do you propose we validate that nouveau is blacklisted?

lengrongfu commented 1 month ago

We can exec lsmod | grep nouveau in the driver validation container and check the result. If nouveau is loaded, meaning it was not blacklisted, we can stop the subsequent steps.
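
To make the idea concrete, here is a minimal sketch of that check in Python. It is not the validator's actual code; the function names `loaded_modules` and `nouveau_is_loaded` are hypothetical. It reads `/proc/modules`, the same file `lsmod` formats, and compares exact module names, so a line that only lists nouveau as a user of another module does not count as a match the way a plain `grep nouveau` would.

```python
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set:
    """Return the module names found in /proc/modules-style text.

    Each non-empty line starts with the module name, followed by size,
    reference count, and other fields that this sketch ignores.
    """
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def nouveau_is_loaded(proc_modules_path: str = "/proc/modules") -> bool:
    """Hypothetical helper: True if a module named exactly 'nouveau' is loaded."""
    # The path is a parameter so the check can be pointed at a test file.
    text = Path(proc_modules_path).read_text()
    return "nouveau" in loaded_modules(text)


if __name__ == "__main__":
    if nouveau_is_loaded():
        raise SystemExit("nouveau is loaded; disable it before continuing")
    print("nouveau is not loaded")
```

A validation step could exit non-zero as shown, which would stop the subsequent steps.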

https://github.com/NVIDIA/k8s-driver-manager/issues/37 As discussed in this issue, blacklisting requires updating the initramfs and rebooting the node, so can we instead check whether nouveau has already been added to the blacklist?
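
A rough sketch of what such a blacklist check could look like, again in Python and again with hypothetical names (`blacklists_module`, `nouveau_blacklisted`, `MODPROBE_DIRS`). It assumes the host's modprobe configuration directories are visible at the listed paths, which may not hold inside a container unless the host filesystem is mounted. It only inspects configuration files; it does not confirm that the initramfs was rebuilt or that the node was rebooted.

```python
import glob
import os

# Commonly used modprobe.d locations (an assumption; adjust for the host mount).
MODPROBE_DIRS = (
    "/etc/modprobe.d",
    "/run/modprobe.d",
    "/usr/lib/modprobe.d",
    "/lib/modprobe.d",
)


def blacklists_module(conf_text: str, module: str = "nouveau") -> bool:
    """True if the text contains an active 'blacklist <module>' line."""
    for raw in conf_text.splitlines():
        # Drop anything after '#' so commented-out entries are ignored.
        fields = raw.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "blacklist" and fields[1] == module:
            return True
    return False


def nouveau_blacklisted(dirs=MODPROBE_DIRS) -> bool:
    """Hypothetical helper: scan *.conf files in the given directories."""
    for directory in dirs:
        for path in sorted(glob.glob(os.path.join(directory, "*.conf"))):
            try:
                with open(path, encoding="utf-8", errors="replace") as handle:
                    if blacklists_module(handle.read()):
                        return True
            except OSError:
                # Unreadable files are skipped rather than treated as fatal.
                continue
    return False


if __name__ == "__main__":
    print("nouveau blacklisted:", nouveau_blacklisted())
```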