pop-os / nvidia-graphics-drivers

Pop!_OS NVIDIA Graphics Drivers

NVidia driver on Pop!_OS 19.10 broke on 440 update #32

Closed fedoraJ closed 4 years ago

fedoraJ commented 4 years ago

More specifically it looks like the driver is in some weird mixed state between 435 and 440.

I noticed some graphical issues on my Oryx Pro (oryp3, Nvidia GTX 1060) and ran nvidia-smi in the command line. I got the message "Failed to initialize NVML: Driver/Library version mismatch". Running dmesg I saw "NVRM: API mismatch: the client has the version 440.31, but this kernel module has the version 435.21. Please make sure that this kernel module and all NVIDIA driver components have the same version."
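The two version numbers in that kernel message are the useful part of the diagnosis. A minimal sketch for pulling them out of the kernel log, assuming the message wording quoted above; the function name nvrm_mismatch_versions is made up for illustration and is not part of any NVIDIA or Pop!_OS tooling:

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print the client
# (userspace) version followed by the kernel module version from the first
# NVRM API mismatch message it finds.
nvrm_mismatch_versions() {
    grep 'NVRM' |
        grep -o 'version [0-9][0-9.]*[0-9]' |   # e.g. "version 440.31"
        head -n 2 |                              # client first, module second
        cut -d' ' -f2 |
        paste -sd' ' -
}
```

It could be run as `sudo dmesg | nvrm_mismatch_versions`. If the two numbers differ, the loaded kernel module and the userspace libraries come from different driver releases.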

I have attempted to roll back to the 435 driver. However, looking at the packages with apt list nvidia-*-435, it looks like they are using the 440 files and not the 435 ones. I have also tried reinstalling the 440 drivers, and purging the NVIDIA drivers and installing fresh. So far nothing has worked.
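One way to check the "435 packages carrying 440 files" observation is to compare each installed package's version string against the branch its name implies. A rough sketch, assuming version strings start with the driver branch (no epoch prefix); wrong_branch is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read "package version" lines on stdin and print those
# whose version does not start with the given driver branch (e.g. 435).
# Lines without a version (packages known to dpkg but not installed) are skipped.
wrong_branch() {
    awk -v b="$1" 'NF >= 2 && index($2, b ".") != 1 { print }'
}
```

For example, `dpkg-query -W -f='${Package} ${Version}\n' 'nvidia-*-435' | wrong_branch 435` would list any 435-named package whose installed version is not a 435 release.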

mmstick commented 4 years ago

Make sure that you've removed all NVIDIA packages

sudo apt purge '*nvidia*'
sudo apt install nvidia-driver-440

fedoraJ commented 4 years ago

I have purged all nvidia packages a few times with no luck. I just tried again to be sure. Still no change. The kernel is still holding on to the 435.21 module.

mmstick commented 4 years ago

Did you reboot after installing 440?

fedoraJ commented 4 years ago

Yes. Whenever I mess with the graphics driver I always reboot to multiuser mode so the driver is not in use at the time. I rebooted after I purged all nvidia files. Then I rebooted back into graphical mode after I installed the 440 driver.

mmstick commented 4 years ago

There's really no need to drop down to a specific mode. You can purge all NVIDIA packages from the running system, install the new nvidia-driver-440, and then reboot. The initramfs will have been updated with the new driver in that process.

fedoraJ commented 4 years ago

Back when I used to install the drivers with Nvidia's own installer on Fedora the installer would refuse to run while X.org was running.

Anyway, that's not relevant here. I did another purge and reinstall while running in graphical mode. Still no change.

mmstick commented 4 years ago

Are you using a custom kernel? Any changes from sudo update-initramfs -c -k all? If you have the 440 driver installed, it's impossible to be booting into the old driver, unless your initramfs isn't being generated for the kernel you're booting into.
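The "initramfs for the kernel you're booting into" condition can be checked directly. A minimal sketch, assuming the images live in /boot under the usual initrd.img-VERSION names; both function names are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the kernel versions that have an initrd image
# in the given directory (default: /boot).
initrd_kernels() {
    dir="${1:-/boot}"
    for f in "$dir"/initrd.img-*; do
        [ -e "$f" ] || continue          # glob matched nothing
        basename "$f" | sed 's/^initrd\.img-//'
    done
}

# Hypothetical helper: succeed if the given kernel version (default: the
# running kernel) has an initrd image in the given directory.
kernel_has_initrd() {
    ver="${1:-$(uname -r)}"
    initrd_kernels "${2:-/boot}" | grep -qxF -- "$ver"
}
```

Running `kernel_has_initrd || echo "no initrd for $(uname -r)"` only confirms that an image exists; it says nothing about which driver version is inside it. Listing the image contents with lsinitramfs and searching for nvidia is one way to look further.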

fedoraJ commented 4 years ago

I am not using a custom kernel. Running the command shows it generating the initrd for kernel 5.3.0-23-generic and 5.3.0-19-generic. 5.3.0-19-generic is the one it is currently booting to.

fedoraJ commented 4 years ago

I got my laptop to reboot into the 5.3.0-23 kernel. The 440 driver is running fine with this kernel. I have set it to be the default boot item in systemd-boot.

hugglesfox commented 4 years ago

I have also had a similar issue. It appears that the system76-driver-nvidia package depends on both nvidia-driver-440 and nvidia-driver-435 which is creating this situation where both the drivers are installed.
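The dependency claim can be checked against the package metadata. A small sketch, assuming the input is the text printed by apt-cache depends; nvidia_driver_deps is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read dependency listing text on stdin and print each
# distinct nvidia-driver-NNN package name mentioned in it.
nvidia_driver_deps() {
    grep -o 'nvidia-driver-[0-9][0-9]*' | sort -u
}
```

Something like `apt-cache depends system76-driver-nvidia | nvidia_driver_deps` would then show whether more than one driver branch is referenced.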

fedoraJ commented 4 years ago

A quick update since the Nvidia driver was updated a little bit ago. I don’t know what the effect was on my laptop yet as I haven’t tried to reboot into the Pop OS kernel. The Ubuntu one is still working fine. I did install Pop on my main desktop machine. A fresh install of Pop OS with the 440.31 driver was working just fine; the update, however, killed it. Same message of driver/library mismatch, this time between 440.31 and 440.44. I rebooted into a different kernel (5.3.0-24) that is simply labeled “Ubuntu” and the 440.44 driver is working fine.

I suspect this is a stock Ubuntu kernel that was installed when I purged Gnome from my system and installed the Kubuntu (KDE Plasma) desktop. (Yes, I dislike Gnome THAT much.)

mmstick commented 4 years ago

@fedoraJ You need to ensure that you've fully upgraded, and then run sudo update-initramfs -c -k all to have an initramfs generated for each installed kernel.

fedoraJ commented 4 years ago

That didn’t work for my laptop and it doesn’t work with my desktop either. When I run updates, I run them from the command line (sudo apt update && sudo apt upgrade -y && sudo apt autoremove -y). None of the updates have failed. Running update-initramfs -c -k all also completes with no issues.

update-initramfs: Generating /boot/initrd.img-5.3.0-24-generic
update-initramfs: Generating /boot/initrd.img-5.3.0-23-generic
update-initramfs: Generating /boot/initrd.img-5.3.0-22-generic

I see it is updating 3 different kernels. They are all labeled Ubuntu.

OS:..................Ubuntu 19.10
Root partition:....../dev/sda3
Root FS UUID:........5af9d45d-b5fb-46e7-bbf4-591e7982dff3
ESP Path:............/boot/efi
ESP Partition:......./dev/sda2
ESP Partition #:.....2
NVRAM entry #:.......-1
Boot Variable #:.....0000
Kernel Boot Options:.quiet loglevel=0 systemd.show_status=false splash
Kernel Image Path:.../boot/vmlinuz
Initrd Image Path:.../boot/initrd.img
Force-overwrite:.....False

I think the version numbers on the kernels threw me off. It doesn’t look like the Pop OS kernel is getting updated. Could the Ubuntu kernel that was installed with the Kubuntu-Desktop meta package somehow be causing an issue where update-initramfs is unable to see the Pop OS kernel?
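Since the kernel and initrd paths in that output are the generic /boot/vmlinuz and /boot/initrd.img, it can help to see which versioned files they resolve to. A minimal sketch, assuming those paths are symlinks to vmlinuz-VERSION and initrd.img-VERSION files; linked_kernel_version is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: resolve a vmlinuz or initrd.img symlink and print the
# kernel version encoded in the name of the file it points at.
linked_kernel_version() {
    target=$(readlink -f "$1") || return 1
    basename "$target" | sed -e 's/^vmlinuz-//' -e 's/^initrd\.img-//'
}
```

Comparing `linked_kernel_version /boot/vmlinuz` with `uname -r` would show whether the default image is the kernel that is actually running.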

mmstick commented 4 years ago

Installing the desktop metapackage from another Linux distribution will override all of Pop's settings. This is why we recommend installing desktop environments, rather than entire derivative desktop packaging. Desktop metapackages were never meant to be installed alongside each other, since the general assumption is that you have only one desktop metapackage that you've opted into, serving as the one true source for all system defaults.

So, you should definitely remove the kubuntu-desktop metapackage. The best way to do this is:

sudo apt-mark minimize-manual
sudo apt purge kubuntu-desktop
sudo apt install kde-full
sudo apt autoremove

Then it's a good idea to reinstall Pop's packaging

sudo apt install --reinstall pop-desktop pop-default-settings

You should have linux-system76 installed.

sudo apt install linux-system76

fedoraJ commented 4 years ago

I wasn’t aware the install instructions for KDE Plasma had changed. I was acting on older instructions from System76’s own website.

Attempting to run those commands on my laptop did not go well. linux-system76 would not install due to dependency issues. I ended up blowing away my entire install and reinstalling Pop OS. After purging Gnome again and installing KDE with kde-full instead of kubuntu-desktop, I noticed there is still an Ubuntu kernel installed. I do not know if it was installed with KDE or another package.

There is a waiting update for linux-generic(5.3.0.24.28) that requires removing linux-system76 due to a conflict. I have not run that update yet.

fedoraJ commented 4 years ago

This is clearly not an issue with the NVIDIA driver. The issue with the driver is merely a symptom of a different issue with Pop OS itself.

samrose commented 4 years ago

Make sure that you've removed all NVIDIA packages

sudo apt purge '*nvidia*'
sudo apt install nvidia-driver-440

fwiw this worked for me

kruegernet commented 3 years ago

Make sure that you've removed all NVIDIA packages

sudo apt purge '*nvidia*'
sudo apt install nvidia-driver-440

fwiw this worked for me

This doesn't work now since all prior versions point to 455.

Can one of the Pop devs please respond with why and how this came to be? Is it a necessary compatibility constraint with other components of the distro?

allentje commented 3 years ago

@kruegernet I can confirm it does not work. It always updates to 455. I tried to roll back to 435, and it started to unpack and install 455 instead.