negativo17 / nvidia-kmod-common

NVIDIA's proprietary driver kernel module common files

Issue with recent change to `kernel-open` driver #13

Closed: davidmezzetti closed this issue 2 months ago

davidmezzetti commented 2 months ago

Hello - Long-time user here. Thank you for all your work on this project and for getting NVIDIA drivers to work on Fedora. As an open-source maintainer myself, I know you never really know who is using your work... unless they have an issue. So before that, I just want to say thank you :smile:

I have an older laptop with an NVIDIA GeForce GTX 1060. With the latest driver, after reboot, I get the following message:

    kernel: NVRM: The NVIDIA GPU installed in this system is not supported by open
    NVRM: nvidia.ko because it does not include the required GPU
    NVRM: System Processor (GSP).
    NVRM: Please see the 'Open Linux Kernel Modules' and 'GSP
    NVRM: Firmware' sections in the driver README, available on
    NVRM: the Linux graphics driver download page at
    NVRM: www.nvidia.com.

I traced this to this recent change.

From looking around, it appears the open drivers only support the Turing architecture and newer. I was able to rebuild the drivers using the proprietary variant as per these instructions: https://negativo17.org/nvidia-proprietary-and-open-source-kernel-modules/. Everything is working after that.

I also have a machine with a newer card, but I'd still prefer the proprietary drivers there for now, as I do a lot of CUDA/GPU work. So I rebuilt using the proprietary drivers on that machine as well.

It appears that every time this package is updated, it's going to overwrite /etc/nvidia/kernel.conf and save the current one to /etc/nvidia/kernel.conf.rpmsave?

Would it make sense to instead save the new config as kernel.conf.rpmnew in case the user has modified the file? I'm not that familiar with the package build process, but I believe I've seen this pattern before. I suspect I'm not the only one who will run into this.

scaronni commented 2 months ago

Hi, no, it's just this time. The file is marked as a configuration file in the package:

https://github.com/negativo17/nvidia-kmod-common/blob/master/nvidia-kmod-common.spec#L88

With release 560 of the drivers, the open and closed modules have feature parity, and from now on any new development will happen on the open drivers only. The NVIDIA installer (runfile) and the official CUDA repository also install the open modules by default.

As such, I've switched as well. Now that you have customized the configuration file, RPM knows it has been modified, so on future updates you will get a /etc/nvidia/kernel.conf.rpmnew, as with any other RPM package that ships configuration files, while the content of your file is preserved as is.
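
To illustrate the `.rpmnew`/`.rpmsave` behaviour described above, here is a minimal Python sketch that looks for files RPM may have left next to a configuration file and shows how they differ from the active one. The helper names `find_rpm_leftovers` and `diff_against` are hypothetical and not part of any package; the only assumption taken from this thread is the path `/etc/nvidia/kernel.conf`.

```python
import difflib
from pathlib import Path


def find_rpm_leftovers(config_path):
    """Return the existing .rpmnew/.rpmsave siblings of config_path."""
    config = Path(config_path)
    candidates = [
        config.with_name(config.name + suffix)
        for suffix in (".rpmnew", ".rpmsave")
    ]
    return [p for p in candidates if p.is_file()]


def diff_against(config_path, other_path):
    """Return a unified diff between the active config and an RPM-created sibling."""
    current = Path(config_path).read_text().splitlines(keepends=True)
    other = Path(other_path).read_text().splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            current, other, fromfile=str(config_path), tofile=str(other_path)
        )
    )


if __name__ == "__main__":
    # Path taken from the discussion above; adjust for other packages.
    conf = "/etc/nvidia/kernel.conf"
    if Path(conf).is_file():
        for leftover in find_rpm_leftovers(conf):
            print(diff_against(conf, leftover) or f"{leftover} is identical to {conf}")
```

Running this after a driver update is one way to see whether the packaged defaults changed, so you can decide whether to merge them into your customized file by hand.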

If your hardware supports it (not the case for the GTX 1060), you should definitely use the open kernel modules: https://download.nvidia.com/XFree86/Linux-x86_64/560.31.02/README/kernel_open.html

Cheers!

davidmezzetti commented 2 months ago

Thank you for the quick feedback and background.

I'll update my kernel.conf next time I upgrade the drivers for the newer machine!