microsoft / azurelinux

Linux OS for Azure 1P services and edge appliances
MIT License

cuda install with kernel-rt for azl 3.0 #10782

Open ankithmr opened 1 month ago

ankithmr commented 1 month ago

I am using Azure Linux 3.0 with the RT kernel, for example 6.6.35.1-rt34-1.azl3. However, installing CUDA pulls in a different kernel, and the nvidia-smi command only works when the OS is booted into that kernel.

Error with RT kernel:

afoedge@insedge-68 [ ~ ]$ nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

If I boot the OS with the RT kernel, nvidia-smi fails. Do you have a CUDA RPM that corresponds to the RT kernel?
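A quick way to check this kind of mismatch is to list which installed kernels actually have an NVIDIA module on disk. The sketch below is only an illustration: it assumes the driver ships as a file named `nvidia.ko` (possibly compressed, e.g. `nvidia.ko.xz`) somewhere under `/lib/modules/<kernel-version>/`, and the function names `kernels_with_nvidia_module` and `check_kernel` are hypothetical helpers, not part of any Azure Linux or NVIDIA tooling.

```shell
#!/bin/sh
# List kernel versions under a modules root that contain an NVIDIA module.
# Assumption: the module file is named nvidia.ko, optionally with a
# compression suffix, and lives below <root>/<kernel-version>/.
kernels_with_nvidia_module() {
    root="${1:-/lib/modules}"
    for dir in "$root"/*/; do
        # Skip the unexpanded glob when the root has no subdirectories.
        [ -d "$dir" ] || continue
        if find "$dir" -name 'nvidia.ko*' -type f | grep -q .; then
            basename "$dir"
        fi
    done
}

# Report whether a given kernel (default: the running one) has the module.
# Returns non-zero when no module is found for that kernel.
check_kernel() {
    kver="${1:-$(uname -r)}"
    root="${2:-/lib/modules}"
    if kernels_with_nvidia_module "$root" | grep -qxF "$kver"; then
        echo "nvidia module present for $kver"
    else
        echo "no nvidia module for $kver"
        return 1
    fi
}
```

After sourcing the file, running `check_kernel` with no arguments while booted into the RT kernel would show whether a module was ever built or installed for it. If only the non-RT kernel is listed by `kernels_with_nvidia_module`, that would be consistent with the nvidia-smi failure above.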

I also tried the CM2 version, but the nvidia-open driver fails to install:

https://github.com/microsoft/azurelinux/blob/3.0/toolkit/docs/nvidia/nvidia.md

root [ /opt/kfo ]# sudo tdnf -y install nvidia-open
Loaded plugin: tdnfrepogpgcheck
1. package nvidia-open-560.35.03-1.noarch requires nvidia-driver-cuda >= 560.35.03, but none of the providers can be installed
Found 1 problem(s) while resolving
Error(1301) : Solv general runtime error
root [ /opt/kfo ]#

ankithmr commented 1 month ago

Can you please provide an update on this?