Closed kenji-miyake closed 2 years ago
@tfoote Hello, thank you for developing this tool. Could you take a look at this PR? :pray:
Alternatively, I had a thought that maybe we should just pass `all` by default? It's going to load more of the driver, but I don't see that as having a significant downside, and the user could still reduce the scope by setting it manually.
@tfoote Although I'm not very familiar with the CUDA specs, I personally think specifying `all` is acceptable considering how `rocker` is used.
But in that case, we should set `all` only when the variable is empty. I mean, for example, `compute,utility,all` is not valid.
$ docker run --rm -it --gpus all -e NVIDIA_DRIVER_CAPABILITIES=compute,utility,graphics ubuntu:22.04
root@f4f80c3232f8:/# exit
$ docker run --rm -it --gpus all -e NVIDIA_DRIVER_CAPABILITIES=compute,utility,all ubuntu:22.04
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
unsupported capabilities found in 'compute,utility,all' (allowed 'compute,utility'): unknown.
Yeah, I think we can then simplify the logic to just set `all` if it's not previously set. Otherwise it will get whatever is set in the environment.
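A minimal sketch of that simplified logic, for illustration only. The function name `resolve_driver_capabilities` and the way the environment is passed in are assumptions made for this example and are not taken from rocker's actual code. It falls back to `all` only when the variable is unset or empty, and never appends `all` to an existing value, since a combination such as `compute,utility,all` is rejected as shown above.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "NVIDIA_DRIVER_CAPABILITIES"


def resolve_driver_capabilities(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the NVIDIA_DRIVER_CAPABILITIES value to pass to the container.

    Hypothetical helper: not part of rocker's real API.
    """
    if env is None:
        env = os.environ
    current = env.get(ENV_VAR, "").strip()
    # Respect whatever the user already set; use "all" only as a fallback.
    return current if current else "all"


if __name__ == "__main__":
    # Print the docker argument that would result from the current environment.
    print(f"-e {ENV_VAR}={resolve_driver_capabilities()}")
```

With this approach a user can still narrow the scope by exporting `NVIDIA_DRIVER_CAPABILITIES` before invoking the tool, and that value is passed through unchanged.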
According to the documentation, the default value of `NVIDIA_DRIVER_CAPABILITIES` is `compute,utility`. Therefore, `nvidia-smi` can be used for non-NVIDIA images.
However, `rocker` causes an error because it sets `NVIDIA_DRIVER_CAPABILITIES` to only `graphics` when it's empty. Our related issue: https://github.com/autowarefoundation/autoware/issues/2452
This PR fixes the behavior.