Open Interpause opened 1 year ago
--gpus all should enable the same runtime behavior, but we need to confirm with the nvidia-container-runtime engineers. Thanks for the heads up.
Has this been changed? If I use --gpus instead of --runtime in run_dev.sh, I get an error like this:
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
@Buddies-as-you-know, could you confirm which version of the CUDA drivers you have installed? The missing libnvidia-ml.so.1 library should be included as part of a proper CUDA installation.
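One way to check whether the host's dynamic linker can see the library named in the error is to search the ldconfig cache. The sketch below is only an illustration: the helper name nvml_status is hypothetical, and it assumes a Linux host where ldconfig -p lists the cached shared libraries.

```shell
# Hypothetical helper: reads a shared-library listing on stdin
# (for example the output of `ldconfig -p`) and reports whether
# libnvidia-ml.so.1 appears in it.
nvml_status() {
    if grep -q 'libnvidia-ml\.so\.1'; then
        echo "found"
    else
        echo "missing"
    fi
}

# Example usage on the host (not inside the container):
#   ldconfig -p | nvml_status
```

If this prints "missing", the driver library is probably not installed or not on the linker path, which would be consistent with the nvidia-container-cli error above.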
https://github.com/NVIDIA-ISAAC-ROS/isaac_ros_common/blob/6d3c5c00e0e2b3fc1d75eb4286848d23b05d6dca/scripts/run_dev.sh#L195-L203
I noticed run_dev.sh's Docker container works (torch.cuda.is_available() returns True) if I replace --runtime nvidia with --gpus all. I also noticed in the dev environment setup guide (https://nvidia-isaac-ros.github.io/getting_started/dev_env_setup.html) that nvidia-container-runtime is deprecated. Is using --gpus all more suitable on newer versions of Docker?
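For anyone who wants a script that works with either flag, here is a minimal sketch of choosing between them at launch time. It is not taken from run_dev.sh: the helper name gpu_flags is hypothetical, and preferring --runtime nvidia when that runtime is registered is just an assumption that can be reversed. The input is the JSON printed by docker info --format '{{json .Runtimes}}'.

```shell
# Hypothetical helper (not part of run_dev.sh): choose GPU flags for `docker run`.
# $1 is the JSON printed by: docker info --format '{{json .Runtimes}}'
gpu_flags() {
    case "$1" in
        # A runtime named "nvidia" is registered with the Docker daemon.
        *'"nvidia"'*) echo "--runtime nvidia" ;;
        # Otherwise fall back to the --gpus flag.
        *)            echo "--gpus all" ;;
    esac
}

# Example usage:
#   flags=$(gpu_flags "$(docker info --format '{{json .Runtimes}}')")
#   echo "docker run $flags ..."
```

Either flag still depends on the NVIDIA container tooling being installed on the host, so this only selects the flag and does not verify that GPU access will work.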