NVIDIA / libnvidia-container

NVIDIA container runtime library
Apache License 2.0

Error linking when the library version on the host is lower than that in the image #213

Open NSBlink opened 1 year ago

NSBlink commented 1 year ago

When the driver and library versions on the host are lower than the library version shipped in the image, the ldconfig step that libnvidia-container runs during container creation repoints the library symlinks in the container at the newer library from the image. The affected libraries then no longer match the host driver and become unusable. For example, running nvidia-smi fails with: Failed to initialize NVML: Driver/library version mismatch.

root@da0fd684b11a:/lib/x86_64-linux-gnu# ls -lah | grep libnvidia-ml
lrwxrwxrwx  1 root root    26 Jul 29 07:59 libnvidia-ml.so.1 -> libnvidia-ml.so.525.105.17
-rw-r--r--  1 root root  1.8M May 12  2022 libnvidia-ml.so.470.129.06
-rw-r--r--  1 root root  1.8M Jul 25 08:32 libnvidia-ml.so.525.105.17
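
To spot this condition in a container, a check along the following lines may help. It is a minimal sketch, not part of libnvidia-container: the function name `find_mismatched_links` and the idea of passing the host driver version as a string (for example "470.129.06", assumed here to be the host version in the listing above) are assumptions made for illustration.

```python
import os
import re

# Hypothetical helper for illustration; not an API of libnvidia-container.
# Matches file names such as "libnvidia-ml.so.525.105.17".
VERSIONED = re.compile(r"^(lib[\w-]+\.so)\.(\d+(?:\.\d+)+)$")


def find_mismatched_links(lib_dir, driver_version):
    """Return (link name, current target) pairs for symlinks in lib_dir that
    point at a versioned library other than the one for driver_version,
    even though a copy for driver_version exists in the same directory."""
    mismatched = []
    for name in sorted(os.listdir(lib_dir)):
        path = os.path.join(lib_dir, name)
        if not os.path.islink(path):
            continue
        target = os.path.basename(os.readlink(path))
        m = VERSIONED.match(target)
        if not m or m.group(2) == driver_version:
            continue
        # Only report links for which the host-version library is present.
        expected = os.path.join(lib_dir, f"{m.group(1)}.{driver_version}")
        if os.path.exists(expected):
            mismatched.append((name, target))
    return mismatched
```

Called as `find_mismatched_links("/lib/x86_64-linux-gnu", "470.129.06")` on a directory laid out like the listing above, it would report the libnvidia-ml.so.1 link.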

I have provided a patch, #214, that solves this problem by recreating the symlinks for driver-version-specific libraries after ldconfig runs.
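
The general idea of relinking after ldconfig can be sketched as below. This is not the code from #214 (the library itself is written in C); it is an illustrative Python sketch under assumptions: the function name `relink_driver_libs` is hypothetical, and the host driver version is assumed to be known and passed in as a string.

```python
import os
import re


def relink_driver_libs(lib_dir, driver_version):
    """Repoint symlinks in lib_dir at the library files whose names end in
    driver_version. Returns the names of the links that were changed.
    Hypothetical sketch; not the actual patch."""
    suffix = "." + driver_version
    relinked = []
    for name in sorted(os.listdir(lib_dir)):
        path = os.path.join(lib_dir, name)
        # Consider only regular files named like "libfoo.so.<driver_version>".
        if not name.endswith(suffix) or os.path.islink(path):
            continue
        base = name[: -len(suffix)]  # e.g. "libnvidia-ml.so"
        other_version = re.compile(re.escape(base) + r"\.\d+(\.\d+)+$")
        for link in sorted(os.listdir(lib_dir)):
            link_path = os.path.join(lib_dir, link)
            if not os.path.islink(link_path):
                continue
            target = os.path.basename(os.readlink(link_path))
            if other_version.match(target) and target != name:
                # Create the new link under a temporary name, then rename it
                # over the old one so the link is never missing.
                # (Assumes no stale "<link>.tmp" entry is left in lib_dir.)
                tmp = link_path + ".tmp"
                os.symlink(name, tmp)
                os.replace(tmp, link_path)
                relinked.append(link)
    return relinked
```

With the directory from the listing above and a host driver version of 470.129.06 (an assumption), this would repoint libnvidia-ml.so.1 at libnvidia-ml.so.470.129.06 and leave the 525.105.17 file in place.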