Open frantic-from-paranoiawf opened 1 week ago
Closing this as solved. For reasons I don't understand, the open kernel modules work with CUDA on vendor firmware, but CUDA on Dasharo seems to need the proprietary modules. That might not be the actual cause, but it's what fixed it for me.
If CUDA doesn't work with the open drivers, it is still a valid issue. From what I saw in the coreboot Matrix channel, it works with any driver on MSI firmware.
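To tell which module flavor is actually loaded when comparing firmware, a small check like the one below may help. It is only a sketch under two assumptions that are not from this thread: that the loaded driver exposes its version string at /proc/driver/nvidia/version, and that the open modules identify themselves with the phrase "Open Kernel Module" on the NVRM line. The function name `module_flavor` is hypothetical.

```python
from pathlib import Path

# Assumption: the loaded NVIDIA kernel module exposes its version string here.
VERSION_FILE = Path("/proc/driver/nvidia/version")


def module_flavor(version_text: str) -> str:
    """Classify the NVRM version line as 'open', 'proprietary', or 'unknown'."""
    for line in version_text.splitlines():
        if line.startswith("NVRM version:"):
            # Assumption: only the open modules include this phrase in the line.
            return "open" if "Open Kernel Module" in line else "proprietary"
    return "unknown"


if __name__ == "__main__":
    if VERSION_FILE.exists():
        print(module_flavor(VERSION_FILE.read_text()))
    else:
        print("NVIDIA kernel module not loaded")
```

Running it once on vendor firmware and once on Dasharo would show whether the two boots are really using the same modules.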
I'm getting the same error on Qubes OS. This command solves the problem for me:
sudo nvidia-smi --id 000:00:01.0 --persistence-mode 1
Component
Dasharo firmware
Device
MSI Pro Z790-P
Dasharo version
Latest dasharo branch
Dasharo Tools Suite version
No response
Test case ID
No response
Brief summary
Nvidia CUDA with Torch does not work on MSI boards running Dasharo: it fails to initialize. Display output does work, though.
How reproducible
100% of the time.
How to reproduce
Expected behavior
Torch + CUDA program starts with no errors, can use CUDA for deep-learning successfully.
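A minimal check along these lines could exercise CUDA initialization without oobabooga in the loop. This is a sketch, not the script from the original report; it assumes PyTorch is installed in the active venv, and the helper name `cuda_status` is hypothetical.

```python
def cuda_status() -> dict:
    """Report whether PyTorch can see and initialize a CUDA device."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_available": False, "error": None}

    status = {
        "torch_installed": True,
        "cuda_available": torch.cuda.is_available(),
        "error": None,
    }
    if status["cuda_available"]:
        try:
            # Allocating a tensor on the GPU forces CUDA context creation.
            torch.zeros(1, device="cuda")
            status["device_name"] = torch.cuda.get_device_name(0)
        except RuntimeError as exc:
            status["error"] = str(exc)
    return status


if __name__ == "__main__":
    print(cuda_status())
```

On a working setup, `cuda_available` should be true and `error` should stay empty; a failure here would point at the driver or firmware rather than at the application.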
Actual behavior
Torch CUDA program returns the following error:
Screenshots
Additional context
Option ROMs loading in UEFI is enabled.
AI program is oobabooga.
Using Gentoo Linux (all dependencies are installed and run inside a Python venv, so Gentoo isn't the cause).
Display output and CUDA work perfectly and quickly on vendor firmware.
X11 is being used.
Solutions you've tried
Loading the Nvidia driver explicitly first in xorg.conf
Disabling Option ROMs loading in firmware
Recloning oobabooga and redownloading all dependencies
Enabling Resizable BAR in firmware