NVIDIA / nvidia-container-toolkit

Build and run containers leveraging NVIDIA GPUs
Apache License 2.0

Red Hat: CUDA Driver or Device Not Found in Docker #178

Open Ikkyu321 opened 11 months ago

Ikkyu321 commented 11 months ago

1. Issue or feature description

System: Red Hat Enterprise Linux release 9.3
Host CUDA driver: Driver Version: 535.113.01, CUDA Version: 12.2
Docker image: nvcr.io/nvidia/pytorch 23.08-py3
Problem: The CUDA driver is not detected, and the nvidia-smi command is not found in the Docker image.

I have installed the NVIDIA Container Toolkit according to the official documentation, but it still doesn't work.

curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
  sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo yum install -y nvidia-container-toolkit

Can anyone help? Thanks.

elezar commented 11 months ago

Hi @Ikkyu321, as shown in your output, the system you're on uses docker as an alias for podman. Please see our documentation on using the NVIDIA Container Toolkit with Podman.
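For anyone wanting to confirm the alias on their own machine, here is a minimal sketch. It assumes that a Podman-backed docker command prints a banner containing "podman" for docker --version, which is what the podman-docker package typically does. The helper name detect_engine is hypothetical and exists only for illustration.

```shell
#!/bin/sh
# detect_engine (hypothetical helper): classify a version banner as
# "podman", "docker", or "unknown" using case-insensitive substring matching.
detect_engine() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    *podman*) echo "podman" ;;
    *docker*) echo "docker" ;;
    *)        echo "unknown" ;;
  esac
}

# Only query the real CLI when a docker command is on PATH.
if command -v docker >/dev/null 2>&1; then
  banner="$(docker --version 2>/dev/null || true)"
  echo "docker resolves to: $(detect_engine "$banner")"
fi
```

If the result is podman, Docker-specific runtime configuration will not apply, and the Podman instructions are the ones to follow.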

Note that the documented setup assumes a relatively recent Podman version, so confirming your version would be ideal.
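A rough sketch of such a version check follows. The 4.1.0 threshold is an assumption based on when Podman is generally described as gaining CDI device support, so verify it against the documentation for your release. The helper version_ge is hypothetical. The CDI commands appear only as comments, following the pattern in NVIDIA's CDI documentation, and the ubuntu image is just a placeholder.

```shell
#!/bin/sh
# version_ge A B (hypothetical helper): succeeds when dotted version A >= B,
# comparing each dot-separated field numerically.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    len = (n > m) ? n : m
    for (i = 1; i <= len; i++) {
      xi = (i <= n) ? x[i] + 0 : 0
      yi = (i <= m) ? y[i] + 0 : 0
      if (xi > yi) exit 0
      if (xi < yi) exit 1
    }
    exit 0
  }'
}

MIN_PODMAN="4.1.0"  # assumed minimum for CDI support; check the docs

if command -v podman >/dev/null 2>&1; then
  # "podman --version" prints e.g. "podman version 4.6.1"; take the last field.
  current="$(podman --version | awk '{print $NF}')"
  if version_ge "$current" "$MIN_PODMAN"; then
    echo "Podman $current should be recent enough for CDI"
    # Typical next steps (not executed here):
    #   sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
    #   podman run --rm --device nvidia.com/gpu=all \
    #     --security-opt=label=disable ubuntu nvidia-smi -L
  else
    echo "Podman $current is older than $MIN_PODMAN; consider upgrading"
  fi
fi
```

Comparing fields numerically rather than as strings matters here: a plain string comparison would rank 4.10.0 below 4.9.0.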

elezar commented 9 months ago

Also see #210 for rootless Podman specifically.