NVIDIA / data-science-stack

NVIDIA Data Science stack tools
Apache License 2.0

[rhel8] nvidia-smi not found, NVIDIA GPU Driver not installed correctly. #90

Open mattf opened 3 years ago

mattf commented 3 years ago
# simulating setup-system because install-docker currently broken
$ ./data-science-stack install-base
...
$ ./data-science-stack install-driver
...
$ reboot
...
$ ./data-science-stack diagnostics
...
###NV### Wed Feb 10 12:32:15 UTC 2021 #### Driver detected (0 means not installed): 0
###NV### Wed Feb 10 12:32:15 UTC 2021 #### NVIDIA SMI:
nvidia-smi not found, NVIDIA GPU Driver not installed correctly.
###NV### Wed Feb 10 12:32:15 UTC 2021 #### CUDA detected (0 means not installed): 0
###NV### Wed Feb 10 12:32:15 UTC 2021 #### Docker detected (0 means not installed): 0
$ lsmod | grep nvidia
nvidia_drm             57344  0
nvidia_modeset       1224704  1 nvidia_drm
nvidia              34086912  1 nvidia_modeset
drm_kms_helper        217088  1 nvidia_drm
drm                   557056  3 drm_kms_helper,nvidia_drm

nvidia-smi is not shipped with the driver itself. It is part of xorg-x11-drv-nvidia-cuda (fedora; rpmfusion), nvidia-driver-cuda (rhel; cuda-rhel from nvidia), or xorg-x11-drv-nvidia-cuda (rhel; non-free rpmfusion), so it isn't available until install-cuda has run.

install-cuda needs to be part of setup-system if nvidia-smi is going to be used for driver identification.

mattf commented 3 years ago

an alternative way to detect driver presence and version: modinfo -F version nvidia
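
A rough sketch of how the modinfo check could replace the nvidia-smi one. The function name detect_nvidia_driver is hypothetical and not something data-science-stack defines; it keeps the "0 means not installed" convention from the diagnostics output above. Note that modinfo reports on the module available for the running kernel, so this is a check that the driver is installed, not that it is loaded.

```shell
#!/bin/sh
# Hypothetical helper, not part of data-science-stack.
# Prints the version of the nvidia kernel module,
# or 0 if modinfo is missing or cannot find the module.
detect_nvidia_driver() {
    # modinfo exits non-zero when the module is unknown; silence its error output
    version=$(modinfo -F version nvidia 2>/dev/null) || version=""
    if [ -n "$version" ]; then
        echo "$version"
    else
        echo 0
    fi
}
```

It could then be called as `echo "Driver detected (0 means not installed): $(detect_nvidia_driver)"`, which would work after install-driver and a reboot without depending on install-cuda.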