NVIDIA / k8s-device-plugin

NVIDIA device plugin for Kubernetes
Apache License 2.0

Support non-standard driver installs #666

Open elezar opened 4 weeks ago

elezar commented 4 weeks ago

These changes add support to the NVIDIA GPU Device Plugin for driver installations that have the following properties:

The NVIDIA Container Runtime or CDI already handles the mounts of driver libraries (e.g. libnvidia-ml.so.VERSION) and device nodes that allow the device plugin to detect and enumerate devices. In addition to these mounts, the hostDriverRoot is mounted into the device plugin container at /driver-root. This allows the detection of driver files that are not required for the device plugin to function, but may be required by specific workloads.

From the perspective of the CDI spec generation running in the device plugin container, all driver files are rooted at /driver-root. Furthermore, since no device nodes are detected at /driver-root/dev, we assume that these are at /dev in the container (and on the host). The generated CDI specifications must be transformed so that they are valid for container workloads started on the host. This means that occurrences of /driver-root need to be replaced with hostDriverRoot, or /host/nvidia/driver as per our example.
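
To illustrate the root transformation described above, here is a minimal sketch in Go. It is not the code from this PR: the function name `transformRoot` and the sample paths are hypothetical, and it only shows the prefix rewrite on plain path strings, assuming that paths outside /driver-root (such as device nodes under /dev) must be left untouched.

```go
package main

import (
	"fmt"
	"strings"
)

// transformRoot is a hypothetical helper (not part of the device plugin API).
// It replaces a leading fromRoot in path with toRoot. Paths that are not
// located under fromRoot are returned unchanged.
func transformRoot(path, fromRoot, toRoot string) string {
	if path == fromRoot {
		return toRoot
	}
	// Require a path separator after the prefix so that a sibling such as
	// /driver-rootfs is not mistaken for a path under /driver-root.
	if strings.HasPrefix(path, fromRoot+"/") {
		return toRoot + strings.TrimPrefix(path, fromRoot)
	}
	return path
}

func main() {
	// Assumed values: the container-side mount point and the host driver
	// root used as the example in this description.
	const containerRoot = "/driver-root"
	const hostDriverRoot = "/host/nvidia/driver"

	// Illustrative paths only; real CDI specs contain many more entries.
	paths := []string{
		"/driver-root/usr/lib/libnvidia-ml.so.1",
		"/dev/nvidia0",
	}
	for _, p := range paths {
		fmt.Printf("%s -> %s\n", p, transformRoot(p, containerRoot, hostDriverRoot))
	}
}
```

With these assumed inputs, the library path is rewritten to start with /host/nvidia/driver, while /dev/nvidia0 is passed through as-is, matching the assumption that device nodes live at /dev both in the container and on the host.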