elementary / os

The OS build system
https://elementary.io
GNU General Public License v3.0

Automatic Nvidia driver installation does not work in current Daily builds #686

Closed jlnr closed 1 year ago

jlnr commented 1 year ago

Hardware: Generic desktop box with Intel Core i7-7700, Nvidia GTX1080, ultrawide 38" Acer screen (3840x1600 75Hz).

When I boot into elementaryos-7.0-stable.20230129rc.iso, Nouveau is in use and everything uses the right resolution right from where plymouth(?) shows the animated spinner below my motherboard logo. Usually, I would then boot into my eOS installation (still using Nouveau) and install the proprietary drivers to get rid of Nouveau's slight glitchiness.

When I boot into elementaryos-7.0-daily.20230829.iso, however, all I see is a black screen. Choosing the "safe graphics" option works and lets me install using 1024x768 VGA drivers, but the resulting system then also requires nomodeset and only works at 1024x768; otherwise I get the same black screen.

I chose to install proprietary drivers during installation, but I assume that doesn't include the Nvidia ones because they need to be downloaded? In any case, I can't find the Nvidia drivers in AppCenter or in the System prefs.

I am not sure if this issue is related to #546 or #324 because it only started failing so recently for me.

Is there anything I can do to see why Nouveau doesn't activate anymore?

davidmhewitt commented 1 year ago

The kernel version in the 7.0 daily ISOs is the much newer 6.x Ubuntu HWE kernel. The current stable release of 7.0 uses an older, non-HWE kernel, so that would be the major difference.

If you have time to test this, could you check an Ubuntu 22.04 ISO and see if you have the same issue? If not, could you let us know if there's a difference in kernel versions between the two?

Next, could you get a full log file from the elementary OS installation with the optional drivers enabled? In theory, the proprietary Nvidia drivers should be downloaded and installed if you have an internet connection during the install. The log file would tell us whether that's the case.

jlnr commented 1 year ago

Re-trying installation of the current Daily ISO:

I wouldn't miss Nouveau if the Nvidia driver installation had worked. Having to select "safe graphics" during the initial installation is obvious enough. So I guess fixing the initial Nvidia driver installation is better because that's all I use Nouveau for anyway. I am really new to the installer/OS side of things, how can I dig into what is going on there?

jlnr commented 1 year ago

Ubuntu 22.04.3 has the same issue: only "safe graphics" works from the USB stick. I'll see if the driver installation works for them.

davidmhewitt commented 1 year ago

@jlnr Thanks for the detailed description!

I think the issue with the drivers in the installer is caused by it attempting to install such an old version of the driver. That version probably also fails to install post-install because such an old Nvidia driver is being installed on a much newer kernel.

However, this feature in the installer uses ubuntu-drivers internally, so I'm not sure why it would be selecting such an old version.

Could you run ubuntu-drivers list on your freshly installed system and see what drivers it suggests? Equally, is the result of that command the same in a live session of the ISO?

Suspect your third issue after installing newer drivers may be from having partially installed v418 drivers from your 2nd bullet point. If you were to repeat the process, install elementary OS, and then see if you can install v535, do you get a working system?

jlnr commented 1 year ago

Before I get back to elementary, some more info from Ubuntu:

This is the list of drivers in the UI, which matches the output from ubuntu-drivers list.

[Image: ubuntu-drivers]

I thought that maybe the 418-server one was meant for headless CUDA servers and that's why it's so old/stable, but I guess the "server" means something else?

Now back to the Daily ISO and its ubuntu-drivers.

jlnr commented 1 year ago

The output of ubuntu-drivers list in an elementary live session actually looks very similar. It has 418-server up to 535.

Is the issue maybe that this bit of Rust tries to install every package in the list? https://github.com/pop-os/distinst/blob/c6d65568701bf8dbc116153acb330d76b1c32f63/src/installer/steps/configure/chroot_conf.rs#L125-L132

I guess what we want is to install only the latest version (with or without -server, not sure)?
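To make the "install only the latest version" idea concrete, here is a minimal Rust sketch. It is not the code that distinst uses. It assumes that plain `ubuntu-drivers list` prints one package per line, optionally followed by a comma and some metadata, and that Nvidia packages are named `nvidia-driver-<version>` with an optional suffix such as `-server`. The function names are hypothetical.

```rust
/// Extract the package name from one line of plain `ubuntu-drivers list` output.
/// Assumed line shape: "<package>, (<metadata>)" or just "<package>".
fn package_name(line: &str) -> Option<&str> {
    let name = line.split(',').next()?.trim();
    if name.is_empty() {
        None
    } else {
        Some(name)
    }
}

/// Parse the numeric version out of a name like "nvidia-driver-535" or
/// "nvidia-driver-418-server". Returns None for non-Nvidia packages.
fn nvidia_version(name: &str) -> Option<u32> {
    let rest = name.strip_prefix("nvidia-driver-")?;
    rest.split('-').next()?.parse().ok()
}

/// Pick the Nvidia driver package with the highest version number.
/// If two packages share the highest version (for example with and without
/// "-server"), the one listed last is returned.
fn latest_nvidia_driver(list_output: &str) -> Option<&str> {
    list_output
        .lines()
        .filter_map(package_name)
        .filter_map(|name| nvidia_version(name).map(|version| (version, name)))
        .max_by_key(|(version, _)| *version)
        .map(|(_, name)| name)
}
```

This only handles Nvidia packages and leaves the `-server` question open, so it would still need a policy for other drivers such as Broadcom ones.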

davidmhewitt commented 1 year ago

Indeed. I've just spotted that too!

It looks like there's a --recommended option for ubuntu-drivers. Does that filter the list down to only a newer one?

jlnr commented 1 year ago

Deleted my last comment, I had a typo in there 🤦‍♂️ ubuntu-drivers list --recommended works and suggests exactly v535 and nothing else.
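For anyone who wants to script the same check, a rough Rust sketch of capturing the output of `ubuntu-drivers list --recommended` could look like this. The helper names are hypothetical, and this is not how distinst invokes the tool (it runs inside the install chroot); it only shows the general pattern with `std::process::Command`.

```rust
use std::io;
use std::process::Command;

/// Run a program and return its stdout as a String.
/// A non-zero exit status is turned into an error.
fn capture_stdout(program: &str, args: &[&str]) -> io::Result<String> {
    let output = Command::new(program).args(args).output()?;
    if !output.status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{} exited with {}", program, output.status),
        ));
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

/// Ask ubuntu-drivers for the recommended driver packages only.
/// Requires ubuntu-drivers to be installed and on PATH.
#[allow(dead_code)]
fn recommended_drivers_raw() -> io::Result<String> {
    capture_stdout("ubuntu-drivers", &["list", "--recommended"])
}
```

The raw string still has to be split into package names before it can be handed to a package manager.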

jlnr commented 1 year ago

Thanks for the pointer to --recommended. I have adjusted the issue title because in my opinion, there's not much point worrying about Nouveau. If anything, it'd probably be better to funnel people harder into installing the proprietary drivers (e.g. by failing the installation when people attempt to do so while offline).

I cannot help much with the distinst thing right now (no Rust environment atm), so I'll try and see what happens if I install no proprietary Nvidia drivers during installation, but then install nvidia-driver-535 manually after the first boot.

davidmhewitt commented 1 year ago

I'm going to check this --recommended option on my hardware with a Broadcom NIC later. I hope that's "recommended", or else it breaks the feature for my proprietary hardware 😅

Thanks for opening the PR though, and your plan to do a clean install and then install nvidia-driver-535 manually sounds like a good test!

jlnr commented 1 year ago

The distinst patch has been merged, but I am a bit lost as to how this change would make it into an installer ISO.

apt show libdistinst-dev says that it comes from https://ppa.launchpadcontent.net/elementary-os/os-patches/ubuntu. But the launchpad page states that the last source update is from 2017. https://github.com/elementary/os-patches doesn't seem to contain anything related to distinst either.

davidmhewitt commented 1 year ago

They get manually copied from the Pop PPA into the elementary PPAs in the Launchpad web UI.

I've just requested a sync of the latest version, status available here: https://launchpad.net/~elementary-os/+archive/ubuntu/os-patches/+packages?field.name_filter=distinst&field.status_filter=published&field.series_filter=

davidmhewitt commented 1 year ago

Looks like that worked, so the next daily iso to be built should have this included.

jlnr commented 1 year ago

Oh no...

[Image: IMG_8160]

I am not sure why ubuntu-drivers list --recommended now returns two packages. Maybe I already had the header things installed last time? Anyway, it's good that this bug happened for me with just an Nvidia card, otherwise it would have failed on someone's Nvidia+Broadcom setup later on.

Time for the next distinst PR...

jlnr commented 1 year ago

Ah duh, it's because --recommended changes the format. It prints all packages separated by spaces, not one per line with metadata after a comma.

davidmhewitt commented 1 year ago

Looking at the code, it's still one driver per line, but the nvidia lines are special and can have two packages on one line: https://git.launchpad.net/ubuntu/+source/ubuntu-drivers-common/tree/ubuntu-drivers#n479

So in theory, you could have something like:

nvidia-driver-535 linux-modules-nvidia-535-generic-hwe-22.04
bcmwl-kernel-source

I can do a distinst PR to fix that up this weekend if you don't beat me to it 😉

jlnr commented 1 year ago

Thanks. My plan is to iterate over each line and then split that by spaces, so that should work. I am already installing Rust, was tempted to learn the language anyway. Let's see how it goes :)
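The plan described here (iterate over each line, then split each line on spaces) can be sketched in a few lines of Rust. This is an illustration, not the patch that went into distinst, and `parse_recommended` is a hypothetical name. It assumes the `--recommended` output looks like the two-line example quoted above.

```rust
/// Collect every package name from `ubuntu-drivers list --recommended` output.
/// Each line may hold one or more package names separated by whitespace.
fn parse_recommended(output: &str) -> Vec<String> {
    output
        .lines()
        // split_whitespace skips leading, trailing, and repeated whitespace,
        // so blank lines and padded lines produce no empty entries.
        .flat_map(|line| line.split_whitespace())
        .map(String::from)
        .collect()
}
```

Given the example output above, this yields `nvidia-driver-535`, `linux-modules-nvidia-535-generic-hwe-22.04`, and `bcmwl-kernel-source`, in that order. Because `split_whitespace` ignores surrounding whitespace, no separate trimming step is needed in this variant.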

jlnr commented 1 year ago

Hmm, so I got the string parsing to work with your example. But I'm hesitant to open my PR because I was wondering if I should install the kernel modules or not, and whether bcmwl-kernel-source wouldn't also list a module package (or the same package name twice), as the format seems to be "%s %s" regardless of whether the package is Nvidia or not.

Taking a step back, maybe it would be easier to use ubuntu-drivers install? I would expect it to do exactly the right thing without any string parsing etc.

https://git.launchpad.net/ubuntu/+source/ubuntu-drivers-common/tree/ubuntu-drivers#n159

The old Ubuntu installer (haven't checked the Flutter version) seems to be doing exactly that, just with optional settings:

https://git.launchpad.net/ubiquity/tree/scripts/simple-plugins?h=jammy#n20

What do you think?

davidmhewitt commented 1 year ago

> Hmm, so I got the string parsing to work with your example.

Looks like good Rust! I'd be tempted to add a trim in there too though, just so we avoid any potential issues with leading or trailing whitespace:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=57cfd9a86966bb51cf5d20fcfddb203d

> But I'm hesitant to open my PR because I was wondering if I should install the kernel modules or not, and whether bcmwl-kernel-source wouldn't also list a module package (or the same package name twice), as the format seems to be "%s %s" regardless of whether the package is Nvidia or not.

For the broadcom driver, the output is definitely just bcmwl-kernel-source, even with the --recommended flag. I assume that maybe the modules package for other drivers is just an empty string, so you just get one package on a line by itself.

> Taking a step back, maybe it would be easier to use ubuntu-drivers install? I would expect it to do exactly the right thing without any string parsing etc.

I think the reason I did this originally is because distinst sets flags on apt to make sure packages can be installed from the "cdrom" (ISO). I think running ubuntu-drivers install may have a chance of not being able to use packages from the ISO because of the differences in the way we build our ISO compared to Ubuntu.

You could probably do this in a live session. If you just do a normal apt install bcmwl-kernel-source, does apt without flags pick up the packages on the "cdrom"? If not, is ubuntu-drivers doing anything clever to pass flags to apt when you do ubuntu-drivers install that would make this work?

jlnr commented 1 year ago

Running apt install bcmwl-kernel-source in a live session does install packages from cdrom://elementary.../ no matter if I am connected to Wi-Fi or not.

jlnr commented 1 year ago

The parsing fix has been merged, @davidmhewitt can you please re-sync the PPA?

jlnr commented 1 year ago

Why did you delete your comment? :) Syncing worked, and so did the driver installation using the 2023-09-10 Daily ISO. 🎉 Thanks a lot for your help.

davidmhewitt commented 1 year ago

> Why did you delete your comment? :) Syncing worked, and so did the driver installation using the 2023-09-10 Daily ISO. 🎉 Thanks a lot for your help.

Ah, the GitHub UI showed it as duplicated so I thought it posted twice and I tried to delete just one 😅

Glad to hear it's working now. Thanks for your persistence in testing it and submitting the PRs to distinst!