NVIDIA / nvidia-container-toolkit

Build and run containers leveraging NVIDIA GPUs
Apache License 2.0

Binaries located at /opt/bin #683

Open denisstrizhkin opened 2 months ago

denisstrizhkin commented 2 months ago

On Gentoo Linux, the NVIDIA binaries are located in /opt/bin.

Output of nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml:

INFO[0005] Selecting /opt/bin/nvidia-smi as /opt/bin/nvidia-smi
INFO[0005] Selecting /opt/bin/nvidia-debugdump as /opt/bin/nvidia-debugdump
INFO[0005] Selecting /opt/bin/nvidia-cuda-mps-control as /opt/bin/nvidia-cuda-mps-control
INFO[0005] Selecting /opt/bin/nvidia-cuda-mps-server as /opt/bin/nvidia-cuda-mps-server

As a result, inside the container nvidia-smi has to be run as /opt/bin/nvidia-smi unless PATH is updated to include /opt/bin. Maybe all the binaries should go into /usr/bin in the container explicitly? I think that would make sense.
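For reference, the PATH update mentioned above could look roughly like the sketch below. It assumes a POSIX shell inside the container; prepend_path is a hypothetical helper name, not something shipped by the toolkit.

```shell
# Hypothetical helper: prepend a directory to PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;            # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;   # otherwise put the directory first
    esac
}

# Make binaries under /opt/bin (e.g. nvidia-smi) resolvable by name.
prepend_path /opt/bin
export PATH
```

This only changes the shell it runs in, so it would have to go into the image's profile or entrypoint to apply to every container.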

elezar commented 2 months ago

@denisstrizhkin yes. At present we generally use the same path in the container as on the host, but this can be restrictive.

As a workaround, you could try applying a CDI transform to the generated CDI spec. Something like:

nvidia-ctk cdi transform root --from /opt/bin --to /usr/bin --relative-to container --input=/etc/cdi/nvidia.yaml

You could also use:

nvidia-ctk cdi generate | nvidia-ctk cdi transform root --from /opt/bin --to /usr/bin --relative-to container --output=/etc/cdi/nvidia.yaml

since the transform command accepts input from stdin by default.
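To check that the transform had the intended effect, one option is to confirm that no containerPath entry in the resulting spec still points under /opt/bin. The sketch below assumes the spec lists each mount with a containerPath: field on its own line; container_paths_clear_of is a hypothetical helper name, not part of nvidia-ctk.

```shell
# Hypothetical helper: succeed only if no containerPath entry in the spec
# file ($1) still points under the given prefix ($2).
container_paths_clear_of() {
    spec=$1
    prefix=$2
    # Only containerPath lines are inspected; with --relative-to container
    # the hostPath entries are expected to keep the original prefix.
    if grep 'containerPath:' "$spec" | grep -qF "$prefix/"; then
        return 1
    fi
    return 0
}
```

It could then be run as container_paths_clear_of /etc/cdi/nvidia.yaml /opt/bin, with a zero exit status meaning that no container path under /opt/bin remains.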