pop-os / nvidia-graphics-drivers

Pop!_OS NVIDIA Graphics Drivers

NVIDIA: 535.54.03 #184

Closed 13r0ck closed 11 months ago

ziprasidone146939277 commented 1 year ago

Release Highlights: an interesting release, with a lot of bug fixes and improvements.

XV-02 commented 1 year ago

On Oryp11, running sudo apt update doesn't register any available updates with this branch, despite running a 525 driver.

13r0ck commented 1 year ago

That sounds like the transitional packaging might be a bit wonky. I will look into it. Should be able to install the 535 package directly though.

leviport commented 1 year ago

Looks like there's some trouble with some DKMS stuff

Loading new nvidia-535.54.03 DKMS files...
Building for 6.2.6-76060206-generic
Building for architecture x86_64
Building initial module for 6.2.6-76060206-generic
ERROR (dkms apport): kernel package linux-headers-6.2.6-76060206-generic is not supported
Error!  Patch buildfix_kernel_6.0.patch as specified in dkms.conf cannot be
found in /var/lib/dkms/nvidia/535.54.03/build/patches/.
dpkg: error processing package nvidia-dkms-535 (--configure):
 installed nvidia-dkms-535 package post-installation script subprocess returned error exit status 5
Setting up libnvidia-decode-535:amd64 (535.54.03-1pop0~1686941847~22.04~b0f9ffb) ...
Setting up libnvidia-decode-535:i386 (535.54.03-1pop0~1686941847~22.04~b0f9ffb) ...
Setting up xserver-xorg-video-nvidia-535 (535.54.03-1pop0~1686941847~22.04~b0f9ffb) ...
dpkg: dependency problems prevent configuration of nvidia-driver-535:
 nvidia-driver-535 depends on nvidia-dkms-535 (>= 535.54.03); however:
  Package nvidia-dkms-535 is not configured yet.

dpkg: error processing package nvidia-driver-535 (--configure):
 dependency problems - leaving unconfigured
Setting up libnvidia-encode-535:amd64 (535.54.03-1pop0~1686941847~22.04~b0f9ffb) ...
No apport report written because the error message indicates its a followup error from a previous failure.

13r0ck commented 1 year ago

Ah, I removed the 6.2 kernel patch from this. I wanted to test whether NVIDIA supports that kernel natively yet; apparently they don't.
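
For anyone hitting the same DKMS error, the check that failed can be reproduced offline. The snippet below is a minimal sketch, assuming the patches are declared with `PATCH[n]="..."` lines in `dkms.conf`; the function name `missing_patches` and the example config fragment are hypothetical, and only the patch filename is taken from the log above.

```python
import re

# Matches lines such as PATCH[0]="buildfix_kernel_6.0.patch" in a dkms.conf.
# PATCH_MATCH[n] lines are deliberately not matched.
PATCH_LINE = re.compile(
    r'^\s*PATCH\[\d+\]\s*=\s*["\']?([^"\'\s#]+)["\']?', re.MULTILINE
)


def missing_patches(dkms_conf_text, available_files):
    """Return patches named in dkms.conf that are absent from available_files."""
    wanted = PATCH_LINE.findall(dkms_conf_text)
    present = set(available_files)
    return [name for name in wanted if name not in present]


# Hypothetical dkms.conf fragment; only the patch filename comes from the log.
example_conf = '''
PACKAGE_NAME="nvidia"
PACKAGE_VERSION="535.54.03"
PATCH[0]="buildfix_kernel_6.0.patch"
'''

# An empty patches directory reproduces the situation from the log.
print(missing_patches(example_conf, []))
```

On a real system, `available_files` could come from `os.listdir()` on the `build/patches/` directory named in the error message.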

mmstick commented 1 year ago

@leviport This should be installable now. I'm currently in cosmic-comp with the 535 driver functioning.

andyczerwonka commented 1 year ago

👍🏻 for this PR, hoping it gets in soon as I would like to try it with Wayland. I'm hoping Alacritty finally runs given the update to the Wayland protocol.

mmstick commented 1 year ago

@andyczerwonka Alacritty works on cosmic-comp with this driver

andyczerwonka commented 1 year ago

@mmstick I'm running Ubuntu 20.04 on my new serw13 laptop, not sure if I can run cosmic-comp?

ahoneybun commented 1 year ago

> @mmstick I'm running Ubuntu 20.04 on my new serw13 laptop, not sure if I can run cosmic-comp?

Only 22.04 has been tested on that hardware so it might have issues.

andyczerwonka commented 1 year ago

@ahoneybun That was a typo - it's 22.04, not 20.04.

ahoneybun commented 1 year ago

@andyczerwonka Currently cosmic-session and the other COSMIC packages are in a Pop repo, so you would need to add that repo, which can cause issues.

mmstick commented 1 year ago

@andyczerwonka It should work in GNOME, but you may have to make some configuration changes: https://www.reddit.com/r/pop_os/comments/iyh890/guide_enabling_wayland_on_nvidia/

XV-02 commented 1 year ago

Still running into issues during install. Last line of terminal output:

update-initramfs: deferring update (trigger activated)

Had it hang in the same spot thrice now.

mmstick commented 1 year ago

@XV-02 Does it hang if you run update-initramfs -u -v? Or give any additional details? I installed the driver on a system that had the 6.3.7 kernel installed, and I had purged the nvidia driver beforehand. I can try reverting and doing a transition from the current driver and kernel.

XV-02 commented 1 year ago

> @XV-02 Does it hang if you run update-initramfs -u -v? Or give any additional details? I installed the driver on a system that had the 6.3.7 kernel installed, and I had purged the nvidia driver beforehand. I can try reverting and doing a transition from the current driver and kernel.

Let me try before you purge/revert your system.

XV-02 commented 1 year ago

Okay, update-initramfs -u -v ran without errors. Additionally, installing the 6.3.7 kernel, purging the old drivers, and installing 535 worked. So I'll work back from there to find which combination of steps causes the problem, and hopefully narrow down the specific issue.

ahoneybun commented 1 year ago

This seems to remove the option to use Wayland or COSMIC for me at least with the 6.2.6 kernel. Changing back to the current 525 driver fixes it.

mmstick commented 1 year ago

@ahoneybun Does it work if you install the 6.3.7 kernel from here? https://github.com/pop-os/linux/pull/262

ahoneybun commented 1 year ago

I'll test that. Currently, upgrading just the NVIDIA driver causes my displays to go black, which makes it difficult to tell whether it has finished so I can reboot.

ahoneybun commented 1 year ago

@mmstick looks like it was a false alarm as it is showing up now.

EDIT: Looks like COSMIC does not launch though but I do have the option.

mmstick commented 1 year ago

I reverted the driver to 525 and then did the transitional upgrade with the 6.3.7 kernel still installed. I'm not seeing any issues post-upgrade with Hybrid or NVIDIA graphics mode, so COSMIC is still working fine with 535 on this system.

ahoneybun commented 1 year ago

@mmstick it looks like the 6.3.7 repo is no longer there at least from https://github.com/pop-os/linux/pull/262

mmstick commented 1 year ago

I'll switch to 6.2.6 and see if that affects anything

mmstick commented 1 year ago

NVIDIA and Hybrid graphics modes are still functioning in COSMIC with the 535 driver after reverting to Linux 6.2.6

ahoneybun commented 1 year ago

I only have a Thelio to test, so it could be an issue with the older GTX 1050 Ti that it uses. I have COSMIC as an option but it just shows the decrypt screen. I can test a Pop Wayland session as well to see whether it is just a COSMIC issue rather than a Wayland issue.

mmstick commented 1 year ago

Perhaps. I'm currently on a system with a RTX 3050 Ti Mobile.

ahoneybun commented 1 year ago

From a non-COSMIC and non-Wayland standpoint with default X11/Xorg this driver seems to be working fine for me and I have not seen any issues after upgrading.

jacobktm commented 1 year ago

I've tested this driver in a Mira-R3 with the 6.2.6 kernel and an RTX 4060 Ti. Installing the driver does have an issue: the system enters an unusable state immediately after the install. I can't even get to other TTY terminals, but I can reboot the system with sysrq commands. However, after rebooting, the driver appears to work fine, at least in the base Pop desktop environment, and performance looks good when I run some game benchmarks.

andyczerwonka commented 1 year ago

@jacobktm Have you tried Wayland using that driver? Or is all your testing under X?

XV-02 commented 1 year ago

Okay, regardless of kernel, if I haven't purged the previous driver install, I see the hang behaviour. But the hang isn't actually a system freeze. I ran script to capture the output from the install, and while the display shows the install freezing at 52%, the capture shows it hitting at least 75%, and if I wait long enough and reboot (using REISUB) it looks like the nvidia driver installs correctly. It's as if the display becomes wholly unresponsive during the install.

In fact, if I install in a TTY, the process finishes successfully, though the screen does blank afterwards.
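
To compare what the display showed against what the install actually reached, the capture written by script can be scanned for the last reported percentage. This is a rough sketch, assuming apt prints progress lines of the form `Progress: [ 52%]`; that format, the function name `last_progress`, and the sample text are assumptions rather than output taken from this thread.

```python
import re

# Assumed format of apt's progress line, e.g. "Progress: [ 52%]".
PROGRESS = re.compile(r"Progress:\s*\[\s*(\d{1,3})%\]")


def last_progress(capture_text):
    """Return the last progress percentage in the capture, or None if absent."""
    matches = PROGRESS.findall(capture_text)
    return int(matches[-1]) if matches else None


# Hypothetical capture fragment, including a terminal escape sequence.
sample = "Progress: [ 52%]\r\x1b[KSetting up ...\nProgress: [ 75%]\r"
print(last_progress(sample))  # 75
```

If the capture reports a higher figure than the screen did, that supports the reading that the display stopped updating while the install carried on.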

mmstick commented 1 year ago

So the issue is that the actively-running driver crashes if its files aren't removed from the system before the new driver's packages are installed. Then there must be a packaging issue somewhere in one of the package scripts.

mmstick commented 1 year ago

@XV-02 I've just pushed changes to our packaging which fix the display server crash while upgrading the driver. The 525 driver crashes if the system services are started at any point during the driver upgrade.

The Pop Shop should also now display a notification requesting to reboot after installing or upgrading the driver. So those with automatic updates will get notified that it's necessary to restart.

Changes in the Ubuntu packaging also helped resolve this. Stopping services alone didn't resolve the issue without syncing with Ubuntu's packaging. And neither did Ubuntu's packaging alone resolve it without stopping and disabling the services during the upgrade.

This also adds packaging rules for the nvidia open driver, since it's now included in the NVIDIA driver installer, and Ubuntu added the package rules for it.

XV-02 commented 12 months ago

Just as a note: This driver update does not address issues around boot on hybrid hardware.

Fixing this is outside the scope of this PR, but just wanted to note it while I'm testing.

XV-02 commented 12 months ago

While testing Steam games in integrated mode with this driver installed, I'm seeing significant screen flicker. Not sure what to check for that.

XV-02 commented 12 months ago

To extend my observation:

While testing 535 on Oryp11, running in integrated mode showed severe screen flicker in multiple Steam games, including both native Linux and Proton-enabled games. This behaviour is not present in dedicated or hybrid graphics modes.

This represents a clear regression compared with the 525 driver.

I'd also be worried about the frequency of flashing this induces in otherwise non-flashing games and its potential to adversely affect users with flash-sensitive medical conditions.

mmstick commented 12 months ago

Interesting that there's a difference in integrated mode. This mode blacklists the nouveau and nvidia driver modules, so the driver should have no effect there. https://github.com/pop-os/system76-power/blob/master/src/graphics.rs#L41

You could check with cat /proc/modules | grep nvidia to see if the driver is loaded for some reason.
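
The same check can be scripted if it needs to run across several test machines. A minimal sketch, assuming the usual `/proc/modules` layout where the module name is the first whitespace-separated field; the function name `loaded_gpu_modules` and the sample lines are hypothetical.

```python
def loaded_gpu_modules(proc_modules_text):
    """Return loaded module names that start with 'nvidia' or equal 'nouveau'."""
    names = []
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]  # first field is the module name
        if name.startswith("nvidia") or name == "nouveau":
            names.append(name)
    return names


# Hypothetical /proc/modules contents, not taken from any machine in this thread.
sample = (
    "nvidia_drm 77824 4 - Live 0x0000000000000000\n"
    "i915 3000000 20 - Live 0x0000000000000000\n"
)
print(loaded_gpu_modules(sample))
```

On a live system the text would come from `open("/proc/modules").read()`. Unlike the grep, this only looks at the module name, so it will not match a module that merely lists nvidia among its dependents.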

XV-02 commented 11 months ago

I'm not seeing evidence that the nvidia driver is loaded, so that seems to be working as we'd expect.

reedlove commented 11 months ago

Hi everyone. I'm so sorry to just jump in here like this. I just wanted to let you guys know that there is a clear performance regression even in 525.125.06-0ubuntu0.22.04.1 running on Linux Mint. There's a "ticking/hitching" that occurs at regular intervals. It's very apparent when playing a relatively modern 3D game. When using Mangohud, you can see it on the frame graph as well. I also experienced the same problem on Arch Linux running the latest 535 driver package. In contrast though, your 525.116.04-1pop0~1686770941~22.04~ac14717 package runs smoothly without issue.

mmstick commented 11 months ago

@reedlove Have you tried with this driver on Pop!_OS?

reedlove commented 11 months ago

> @reedlove Have you tried with this driver on Pop!_OS?

I'll try it right now and let you know in a minute.

reedlove commented 11 months ago

Yup. 535.54.03 on an up-to-date install of Pop!_OS stutters horribly compared to 525.116.04. My reliable test is using "Ori and the Blind Forest: Definitive Edition" on Steam Version: 1689034492, using Proton 8.0-2 or Proton 7.0-6. RTX A2000 6GB. It's bizarre because I don't see it with all Vulkan games/apps, but I do see it when using Proton, regardless of the version. Let me know what else I can do to help get you more information.

P.S. My Linux Mint 21 machine is running a GeForce RTX 3060 Laptop chip and has the same stutter issue on 535.

mmstick commented 11 months ago

@reedlove May be worth reporting the issues to NVIDIA's developer forum for Linux. https://forums.developer.nvidia.com/c/gpu-graphics/linux/148

reedlove commented 11 months ago

> @reedlove May be worth reporting the issues to NVIDIA's developer forum for Linux. https://forums.developer.nvidia.com/c/gpu-graphics/linux/148

I was thinking about it. I literally just threw together a new media center machine and noticed that the performance was waaaaayyyy worse than it should be for the hardware inside of it.

andyczerwonka commented 11 months ago

I was thinking the same thing with this new serw13 laptop

leviport commented 11 months ago

I can run some comparative benchmarks and see if I can get some numbers.

gabriele2000 commented 11 months ago

I'll just chime in and say that I've read somewhere on that-lemmy-clone that there's an issue related to shader caching.

Basically DXVK recreates the shaders every time you open a game, which is really stupid. I just can't drop v535 and go back to v525 though: since a recent commit of DXVK, v525 is unsupported.

leviport commented 11 months ago

For Deus Ex: Mankind Divided, I averaged about 0.6 fps lower on this 535 version. This was on a 4060 serw13. That's a native Linux title though, so I'll probably have to try a Proton game as well.

leviport commented 11 months ago

I actually saw a small improvement on this driver in Arkham Knight: [Screenshot from 2023-07-17 12-00-28]

XV-02 commented 11 months ago

Okay, well, I'm now seeing the same flickering on 525 as well in the same situation, so I'm no longer inclined to consider it a regression. It is frustrating, regardless, but is only present in integrated mode. In hybrid and dedicated modes, it's behaving. I want to test the lab 3060 with this, and see if the stutter issues reported earlier in this PR are present, and try and quantify that though.
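
One possible way to quantify the stutter is to flag frames that take much longer than a typical frame. This is a rough sketch, assuming a list of per-frame times in milliseconds is available from whatever capture tool is used; the function name `find_hitches`, the `factor` threshold, and the sample values are all hypothetical and are not measurements from this thread.

```python
from statistics import median


def find_hitches(frame_times_ms, factor=2.0):
    """Return (index, frame_time) pairs for frames slower than factor x median."""
    if not frame_times_ms:
        return []
    threshold = factor * median(frame_times_ms)
    return [(i, t) for i, t in enumerate(frame_times_ms) if t > threshold]


# Hypothetical frame times in milliseconds, not measurements from this thread.
sample = [16.7, 16.6, 16.8, 50.1, 16.7, 16.5, 16.9, 48.3, 16.7]
hitches = find_hitches(sample)
print(hitches)
```

Comparing the number of flagged frames, and the spacing between their indices, for the same scene on 525 and 535 would show whether the hitching is periodic and whether it differs between drivers.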