Open ws909 opened 4 months ago
After updating the latest supported driver on PopOS in Pop!_Shop to 550 a few days ago,
There was no 550 update a few days ago. This is from the commit log:
550.67 was out for several months. The 555 series is the only thing that's been pushed recently.
Additional info from Mattermost:
The logs don't show any NVIDIA updates until the user manually installed 555 on 7/5. 555 has not yet been released from master staging; meanwhile, 550.67 was released on March 29th, so the timeline doesn't add up for it to be the cause: https://github.com/pop-os/repo-release/pull/328
The Linux kernel was updated to 6.9.3 on 6/25. The user's going to try booting into an older kernel to see if it behaves better.
Booted into "oldkern.conf" (Linux 6.8.0-76060800daily20240311-generic), and the behaviour was identical, as far as I am able to confirm.
I updated my motherboard's UEFI to 1661 (ASUS PRIME Z790-P WIFI), and that immediately improved the situation. I was able to start up Apex Legends and play 2 or 3 games of the Control gamemode. During this time, the flashes described in point 1 were common, although with longer and less predictable intervals. Occasionally, the monitor would lose the video signal, reporting that there was an error. Replugging the DP cable let the monitor receive a signal again. Switching between Apex Legends and the desktop caused a lot of issues, most notably a very long loop of 5-20 black screen flashes, video signal loss lasting anywhere from 5 seconds to indefinitely, and a rare second of video from the game. The last time I replugged the cable, I was stuck with a low framerate, and could not change this until I quit the game (I could not stop the game from claiming the screen).
After the last time I replugged the DP cable, I noticed the screen tearing was back, so I enabled "Force Composition Pipeline". I then relaunched Apex Legends. The game launched as fast as it should (it no longer hangs at "processing shaders" like before the UEFI update). However, the fullscreen game splash screen renders at a very low framerate. After 10 minutes, only a few frames have rendered, and the splash screen does not even seem to be halfway through.
Also tested Battlefront II (2017) again after enabling "Force Composition Pipeline", and it's behaving exactly like before the UEFI update. In the past, it sometimes froze the screen, and sometimes killed itself after a few minutes of a black window/fullscreen. This time, it killed itself.
It is worth noting that the entire system seems to barely respond while a game in fullscreen is hanging like this. Application switches, bringing up other applications on top, etc, take many seconds (20+) to respond. Sometimes, the game moves out of the screen (not sure if it is minimized, at least it doesn't render to the screen), and the DE is rendered again. While the DE is being rendered, the computer responds just like normal, even though the game is running, and it can still be brought back by clicking the icon in the dock. That also makes the entire system seem to hang again.
Managed to test Apex Legends again today in fullscreen, borderless fullscreen, and windowed mode (full resolution framebuffer, but the window is displayed below the menubar and window title bar). In fullscreen and borderless fullscreen, all the same issues persist. In windowed mode, however, the frame rate is considerably lower (obviously), but there are no issues other than that.
Is the issue that the DE and the fullscreen application are fighting over control of the GPU, and the GPU is responding to that with "undefined" (had to) behaviour? No idea why "Force Composition Pipeline" would have an effect on that, though. That said, that option makes absolutely no impact on games when they run in windowed mode. I still don't know how that explains it preventing Apex Legends from starting up in a timely manner, though (as explained in the first post, Steam spends forever "processing shaders" before even launching the game's launch screen window, which is always a tiny window, before entering fullscreen).
I've also noticed that vkcube --present_mode 1 (VK_PRESENT_MODE_MAILBOX_KHR) exits with "Present mode specified is not supported", despite the 4080 S supporting this. However, the 4080 is only listed with Windows drivers in that database, so it's not a given that the mode is supposed to be available on Linux.
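For anyone else trying the other present modes: the number passed to vkcube --present_mode is the raw VkPresentModeKHR enum value from the Vulkan headers. A minimal sketch of the mapping for the four core modes follows; the helper name present_mode_name is made up for illustration and is not part of vkcube.

```shell
# Hypothetical helper: map a core VkPresentModeKHR integer to its enum name.
# The values 0-3 are the core present modes defined by the Vulkan specification.
present_mode_name() {
  case "$1" in
    0) echo "VK_PRESENT_MODE_IMMEDIATE_KHR" ;;
    1) echo "VK_PRESENT_MODE_MAILBOX_KHR" ;;
    2) echo "VK_PRESENT_MODE_FIFO_KHR" ;;
    3) echo "VK_PRESENT_MODE_FIFO_RELAXED_KHR" ;;
    *) echo "unknown"; return 1 ;;
  esac
}

# Example: label each mode before trying it by hand.
#   for m in 0 1 2 3; do echo "$m = $(present_mode_name "$m")"; done
#   vkcube --present_mode 2
```

FIFO (2) is the only mode the specification requires every implementation to support, so a failure on mode 1 alone does not necessarily point to a broken driver install.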
Solved it. I also reported my issue on the Nvidia forums, so I'm going to quote my own reply there:
Yeah, I've located the offending setting. I may have enabled it myself a while ago; not sure if it defaults to on or not. Either way, I feel very stupid for spending such a long time figuring out something as basic as that, especially as I may have caused all this by enabling the offending setting in the first place.
- Nvidia X Server Settings
  - X Screen 0
    - OpenGL Settings:
      - Allow Flipping
If this setting is enabled, I experience the behaviour I've seen when either "Allow G-SYNC/G-SYNC Compatible" is enabled (blackouts and video output crash), or when "Force Composition Pipeline" is enabled (system freeze when a Wine game's fullscreen window is key).
I guess it's up to Nvidia, PopOS, or X Server developers to decide if this is a bug.
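For reference, the same toggle can probably be flipped from a terminal instead of the GUI. This is only a sketch: it assumes the nvidia-settings attribute behind the "Allow Flipping" checkbox is named AllowFlipping, which I have not verified against the 555 driver. The wrapper name set_allow_flipping and the DRY_RUN variable are made up for illustration.

```shell
# Hypothetical wrapper around nvidia-settings.
# Assumption: the "Allow Flipping" checkbox corresponds to the AllowFlipping attribute.
# By default (DRY_RUN=1) it only prints the command; set DRY_RUN=0 to actually run it.
set_allow_flipping() {
  value="$1"   # 0 = disable flipping, 1 = enable flipping
  case "$value" in
    0|1) ;;
    *) echo "usage: set_allow_flipping 0|1" >&2; return 2 ;;
  esac
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "nvidia-settings --assign AllowFlipping=$value"
  else
    nvidia-settings --assign "AllowFlipping=$value"
  fi
}

# Example:
#   set_allow_flipping 0              # prints the command it would run
#   DRY_RUN=0 set_allow_flipping 0    # applies it to the running X session
```

Like the GUI checkbox, a change made this way would only apply to the running X session unless it is re-applied at login.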
After updating the latest supported driver on PopOS in Pop!_Shop to 550 a few days ago, games have been practically unplayable on my computer. I am stuck with one of two options:
1. Start up the computer, and go straight to opening a game. I will often (approximately every 30-60 seconds) get a "display signal disconnected" flash, as in, flashes of variations of black colour covering the entire screen, completely replacing the actual video signal, for 2-3 seconds. I'll also have screen tearing at the bottom of the screen while outside of games (so in the normal desktop interface).
2. The usual solution to number 1 has been to open the Nvidia X Server Settings application and turn on "Force Full Composition Pipeline" (something I have to do every time I wake the computer, since it resets, and it's impossible to save the settings). Unfortunately, toggling this on somehow stops the driver from cooperating with games. Turning the setting off again does not revert to the behaviour observed in point 1; it's now stuck here. The GPU fans spin up when the GPU is sent work from games that I start up, but the speed of execution seems to be extremely low. It seems to be spinning in some sort of lock (but not a deadlock). Starting up games goes from 2 seconds to several minutes (1.5 hours the first time with Apex Legends). When games finally open a fullscreen window to begin rendering, each frame takes around 10-30 seconds to render. Game logic is processed at the same speed. Audio still plays well, for as long as the audio thread is fed work from whichever thread is held back by the graphics driver.
Tested with a variety of games, amongst them Apex Legends in Steam, and Battlefront 2 from the EA App (Lutris). On the other hand, vanilla, native Minecraft is without issues. I am not sure if that is because Minecraft's bundled binaries are native (such as GLFW), or if it's because Minecraft uses OpenGL instead of Vulkan.
I have since updated to 555 by following these steps:
sudo apt-manage add popdev:nvidia-555.58
I did this in the hope that my issues would go away without too much work. As far as I can tell, the behaviour is identical. Unfortunately, this means I am not sure which version of 550 was installed beforehand. I am not sure how to check that. It was the latest available one that popped up in Pop!_Shop around 3 days ago, though. It's labelled as "Transitional package for nvidia-driver-555", but that's all I can read in the user interface.
- Operating System: Linux-x86_64
- NVIDIA Driver Version: 555.58.02
- NVML Version: 12.555.58.02
- Graphics Processor: NVIDIA GeForce RTX 4080 SUPER
- Display: Samsung Odyssey G70B (DP-0) (3840x2160), connected to one of the DisplayPort GPU ports
- Operating System: Pop!_OS 22.04 LTS
- Windowing system: X11
- Gnome version: 42.9
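On checking which 550 version was installed beforehand: one possible way, assuming the standard Debian/Ubuntu dpkg log under /var/log/dpkg.log has not been rotated away, is to filter it for nvidia-driver package changes. The function name nvidia_driver_history is made up for illustration.

```shell
# Hypothetical helper: read dpkg log text on stdin and print install/upgrade/remove
# events for nvidia-driver-* packages as: date action package old-version new-version.
# dpkg log lines have the form: "DATE TIME ACTION PACKAGE:ARCH OLD-VERSION NEW-VERSION".
nvidia_driver_history() {
  awk '($3 == "install" || $3 == "upgrade" || $3 == "remove") && $4 ~ /^nvidia-driver-/ {
    print $1, $3, $4, $5, $6
  }'
}

# Example (zcat -f passes uncompressed logs through and decompresses rotated .gz ones;
# sort restores chronological order because each line starts with the date):
#   zcat -f /var/log/dpkg.log* | nvidia_driver_history | sort
```

On "upgrade" lines, the fourth printed column is the version that was installed before the upgrade, which is the piece of information that gets lost once a newer driver is in place.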
As an aside, my keyboard also randomly seems to get disconnected, while playing games, and sometimes even right after starting from hibernation. This happened after the latest 550 driver, too, but I can't possibly imagine a graphics driver update should have an impact on that.
I also noticed that setting "Force Full Composition Pipeline" in 555 resets the UI scaling to 1x. It's been happening with a small set of the Nvidia drivers I've had installed in the last two months. That is probably a separate issue, however.