Open nevivurn opened 1 month ago
If this has no user visible change, we could even make this the default.
This does have a user-visible change. If a user runs a container that sets NVIDIA_VISIBLE_DEVICES=all and does not specify GPU requests or limits, previously they would have access to every GPU on the node, while with the above config they would see none.
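
To make the affected case concrete, here is a minimal sketch of the kind of pod described above. The pod name and image are hypothetical placeholders, and it assumes the pod is handled by the NVIDIA container runtime on a GPU node. It sets the environment variable and declares no `nvidia.com/gpu` requests or limits:

```yaml
# Hypothetical pod for illustration; the name and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-env-only
spec:
  containers:
    - name: workload
      image: example.com/cuda-workload:latest  # placeholder image
      env:
        # Asks the NVIDIA runtime to expose every GPU on the node.
        - name: NVIDIA_VISIBLE_DEVICES
          value: "all"
      # No resources.requests or resources.limits for nvidia.com/gpu,
      # so the scheduler does not account for any GPU usage by this pod.
```

A pod like this is the one whose behavior would change: today it sees all GPUs, and with the stricter runtime config it would see none unless it requests GPUs through resource limits.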
It would be useful if it were possible to customize `nvidia-container-runtime.toml` without having to build new boot assets.

We are using NVIDIA GPUs in our cluster, and we want to prevent users from accessing all GPUs on a system by setting `NVIDIA_VISIBLE_DEVICES=all`, instead requiring proper resource requests and quotas.

NVIDIA does provide a way to do this, as documented here, through settings in `nvidia-container-runtime.toml`.

Currently, there does not seem to be a way to do this without building the extensions and boot assets.