pop-os / nvidia-graphics-drivers

Pop!_OS NVIDIA Graphics Drivers

NVIDIA 525.60.11 #171

Closed 13r0ck closed 1 year ago

13r0ck commented 1 year ago


mmstick commented 1 year ago

Works, but the 515 packages will disappear once we merge it, and they'll get updated to the Ubuntu version of 515 which may not support Linux 6.0. We should make a new -515 repo like we did for 470 before we merge.

13r0ck commented 1 year ago

@mmstick could we just fork master how it is now and then rename the repo? Or is there another step needed?

mmstick commented 1 year ago

Yeah we'd make a new GitHub repo and then change the remote and push master to the new repo
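As a rough sketch of that step, assuming an existing clone whose `origin` still points at the current repository and a new, empty repository that has already been created on GitHub (the function name and the URL argument are illustrative, not taken from this thread):

```shell
# Sketch only: repoint an existing clone at a new remote and push master.
# $1 is the URL of the new, already-created empty repository (caller-supplied).
push_master_to_new_repo() {
    new_url=$1
    # Replace the URL behind the existing "origin" remote.
    git remote set-url origin "$new_url" || return 1
    # Publish the local master branch to the new repository.
    git push origin master
}
```

Run it from inside the clone, passing the new repository's URL as the only argument. The old repository is left untouched, so the current packages stay where they are.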

mmstick commented 1 year ago

The control file in the 515 repo would also need a change like this one: https://github.com/pop-os/nvidia-graphics-drivers-470/blob/master/debian/control#L1

n3m0-22 commented 1 year ago

Tested on oryp4, gaze15
Pop!_OS: 20.04, 22.04
Ubuntu: 20.04, 22.04

All testing is passing.

On Pop 20.04 the 510 driver creates a dependency issue that prevents installing the 525 driver. It can be solved with sudo apt purge ~nnvidia && sudo apt autoremove.

Note: On the gaze15 running 20.04, the fans were running at full speed at idle after the install. Rebooting and Fn+1 made no difference. The system had to be powered off and the battery disconnected to reset it. I was never able to reproduce this.

Desktop testing is still required.

leviport commented 1 year ago

The 4080 does not seem to want to give me any graphics with this driver. I'm currently testing on a non-Thelio machine (so the system76-power bug is not to blame, I'm pretty sure). If I boot using the iGPU instead, I do see the 4080 in Nvidia XServer Settings as well as nvidia-smi, so it's at least recognized.

I'm not sure blocking release for such a new card is desirable, but I know there is a lot of urgency to get the 4080 working.

I think I can try a 4090 tomorrow to see if that's any more cooperative.

jacobktm commented 1 year ago

I've tested this on our Major B4 with both the 4090 and the 4080. It works perfectly with the 4090. With the 4080, however, I can get it to output graphics, but the system is basically unusable: it is extremely laggy, and the only thing that displays smoothly is the mouse cursor. As a quick frame of reference, I browsed to WebGL Aquarium and it only reaches 1 FPS with just 500 fish.

kylebakerio commented 1 year ago

I'm a moderately savvy Linux user; am I able to help with testing? I have a laptop with a 3080 using dynamic boost, and later this month I will be switching to a refresh of this laptop with an A5500 ('studio'/'quadro' equivalent to a 3080ti). Happy to test this on both if some basic instructions are given or I'm pointed to them.

(X1 Extreme Gen 4, upgrading to P1 Gen 5 that's in the mail)

leviport commented 1 year ago

Sure, you can add this staging repo and install this driver with these commands:

sudo apt-manage add popdev:nvidia-525.60.11
sudo apt update && sudo apt upgrade

When you want to remove the staging repo (once it's merged, the source will 404 when you try to update), run this:

sudo apt-manage remove popdev-nvidia-525-60-11

leviport commented 1 year ago

Oh, actually you might have to manually install the 525 driver after adding the staging branch. 515 is a separate version, so the upgrade won't be automatic: sudo apt install nvidia-driver-525. Sorry for the misdirection.
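Put together, the staging workflow in this thread looks roughly like the sketch below. It only prints the commands so they can be reviewed before running anything. The function names are made up for illustration, and the `popdev-` prefix with dots turned into dashes is inferred from the add/remove commands quoted in this thread, not from apt-manage documentation.

```shell
# Sketch only: derive the source name that apt-manage remove expects.
# Assumption: "popdev-" prefix plus the branch name with dots replaced by
# dashes, as seen in the commands quoted in this thread.
popdev_source_name() {
    printf 'popdev-%s\n' "$1" | tr '.' '-'
}

# Print (do not run) the add/install/remove commands for a staging branch.
# $1: staging branch (e.g. nvidia-525.60.11), $2: driver package to install
print_staging_commands() {
    branch=$1
    package=$2
    echo "sudo apt-manage add popdev:$branch"
    echo "sudo apt update"
    echo "sudo apt install $package"
    echo "# later, to drop the staging repo:"
    echo "sudo apt-manage remove $(popdev_source_name "$branch")"
}
```

For this pull request the call would be `print_staging_commands nvidia-525.60.11 nvidia-driver-525`. Naming the driver package explicitly matters because a plain upgrade stays on the installed 515 package.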

ziprasidone146939277 commented 1 year ago

Same as @kylebakerio, I want to help test this version. I guess many people are waiting for this (like me). I have some doubts about the procedure after adding the popdev. Is this OK: I add popdev, then install 525, reboot, and remove the popdev? And what will happen in Pop!_Shop when 525 becomes stable?

Thank you! And sorry for my English.

System: oryxp9; RTX 3070Ti

OS: NAME="Pop!_OS" VERSION="22.04 LTS"

leviport commented 1 year ago

> Is this OK: I add popdev, then install 525, reboot, and remove the popdev? And what will happen in Pop!_Shop when 525 becomes stable?

Don't remove it unless either you want to uninstall 525, or it starts to 404 after this pull request is merged or closed.

ziprasidone146939277 commented 1 year ago

> > Is this OK: I add popdev, then install 525, reboot, and remove the popdev? And what will happen in Pop!_Shop when 525 becomes stable?

> Don't remove it unless either you want to uninstall 525, or it starts to 404 after this pull request is merged or closed.

Ah, OK. So I add popdev (this only affects NVIDIA, right?), and then:

apt-manage add popdev:nvidia-525.60.11
sudo apt update && sudo apt upgrade


apt install nvidia-driver-525 ?

And wait for this to be closed before removing the popdev?

Currently, I have: Driver Version: 515.65.01
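A version string like that can be used to check which driver branch is loaded before and after the install. The sketch below assumes `nvidia-smi` is on the PATH and the driver is running. The helper names are made up for illustration.

```shell
# Sketch only: extract the driver branch (major version) from a version string.
driver_branch() {
    # Strip everything from the first dot onward: 515.65.01 -> 515
    printf '%s\n' "${1%%.*}"
}

# Ask the running driver for its version; needs a working nvidia-smi.
# With several GPUs the same version is reported once per GPU, so keep one line.
current_driver_version() {
    nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1
}
```

After a reboot, `driver_branch "$(current_driver_version)"` should print 525 if the new driver is the one in use. If it still prints 515, the staging package was not installed or is not loaded.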

And one last question: after installing (if everything goes fine), which tests should I run? And, if applicable, should I report the results right here?

Thank You in advance.

kylebakerio commented 1 year ago

@ziprasidone146939277 see: https://github.com/pop-os/nvidia-graphics-drivers/blob/master/TESTING.md

ziprasidone146939277 commented 1 year ago

> @ziprasidone146939277 see: https://github.com/pop-os/nvidia-graphics-drivers/blob/master/TESTING.md

OK Thank You. Unfortunately I don't have anything to test DisplayPort.

ziprasidone146939277 commented 1 year ago

Installed 525; no problems so far.

sudo apt-manage add popdev:nvidia-525.60.11
sudo apt update
sudo apt install nvidia-driver-525

ziprasidone146939277 commented 1 year ago

omg. Finally, CS:GO (Steam) with the "-vulkan" launch option works great now: a stable 200 FPS at max quality settings with 525 vs. 90 FPS (with FPS drops) on 515. Looks like a great improvement. See: https://github.com/ValveSoftware/csgo-osx-linux/issues/1477

I would like to complete the test, but I do not have a monitor and cable to test DisplayPort.

Thank You very much

kylebakerio commented 1 year ago

K, you didn't include the install command up there, but I installed it. sudo apt install nvidia-driver-525.

I'm on an X1 Extreme Gen 4 with an RTX 3080. I have two 1080p external monitors connected through USB-C DisplayPort. This setup works fine/great with 515.

MUX switch set to discrete when I start.

I was in NVIDIA Graphics, not hybrid, when I installed it.

After rebooting (still in NVIDIA Graphics mode):

I had problems with my external monitors connected through USB-C / DisplayPort. Their orientation wasn't correct (I use them vertically). Trying to adjust them in Settings -> Displays resulted in a minute or two of lag and dysfunction, maybe longer, before a dysfunctional but usable result showed up (lots of black screen, mouse resizing while waiting). The end result was also dysfunctional, and so was trying to revert.

When I unplugged the external monitors, it was fine with native display only.

I then set it to Hybrid Graphics and switched the MUX to hybrid before booting into pop.

This worked well with external monitors immediately. My old profile that had the two external monitors rotated vertically was correctly used, and my 4k monitor was showing at 2k, as it was set to in my hybrid profile.

I do notice a recurring, consistent periodic GPU spike in the background that raises an eyebrow; this was with only psensor running on my desktop after a clean boot (aside from a handful of GNOME extensions that I doubt are related). Nothing too crazy, just something like a 10% jump every 20 seconds or so. I didn't look too carefully; I just noticed it when I left the machine idling, looked away for a second, and came back.

Positively, xorg is no longer showing a 30+% cpu utilization at all times in the background!

Previously, when I would open discord and toggle on Advanced => Hardware Acceleration while in hybrid, I would see diminished performance and horrible/unusable window dragging stuttering performance. I don't experience that now--that seems to make no difference, so that bug seems fixed.

On the other hand, I do notice some less-severe but still present stuttering on the windows when dragging windows around in general. It's usable, but considering the horsepower in this machine, it really puts a damper on the impression of pop (even though it's nvidia's fault, presumably).

CPU utilization stays low when this is happening, btw, stays in single digits, maybe even sub 5% iirc.

Switching back to NVIDIA Graphics (but leaving the MUX switch on hybrid), everything looks good. Again, the resolution on the built-in 4k screen is set to 2k, and the external monitors are recognized as 1080p and rotated correctly.

Unplugging and re-plugging the usb-c cable connecting the two monitors worked smoothly.

Switching to 200% seemed to work correctly. Enabling fractional scaling and scaling to 150% seemed to work correctly. However, when I then disabled fractional scaling and switched to 100%, I got a disconnected display: my main monitor was positioned such that it was not contiguous with my two external displays, which shouldn't be possible. This trapped my mouse in the external displays. Using the keyboard, I was able to move the display controls from the main display to the externals and fix this. (I selected 'keep this config' before I realized there was a problem.)

Switching to Compute, everything seemed fine. I didn't test much, just opened katrain, which uses the gpu as a compute node, since I had it installed. It worked, printing

2022-12-03 22:23:19-0600: Found OpenCL Platform 0: NVIDIA CUDA (NVIDIA Corporation) (OpenCL 3.0 CUDA 12.0.89)

in the terminal, and functioning as normal.

I then opened stable-diffusion-ui, which happily generated 4 images in parallel at a speed comparable to the one I experienced with 515--very fast. See here, stable-diffusion-ui's output for a photo of an ai making a photo:

[Image: a_photograph_of_an_AI_making_a_photograph_Seed-8269105_Steps-25_Guidance-7 5]

Switching back to nvidia, mux switch still on hybrid:

looks great

There is a bug that shows up in one of my GNOME extensions when I do this (Vitals), where the spacing between the icons it adds at the end of the top bar is now too large. Probably a Vitals or GNOME issue, ofc.

Restarting with only the MUX switch changed, still in NVIDIA Graphics mode in Pop:

I ran through the same test described in the section right above, and all is perfect this time! The only caveat is an issue I've had in 515 as well: my built-in display only shows a 4k resolution as available, and no other resolutions are selectable.

If I open NVIDIA Settings and try to set 2k for the built-in display there (it shows all resolutions as available), I get almost identical behavior to 515: it looks great, except that when I move the mouse to the bottom of the screen, it pans down to a glitchy dead space that seemingly corresponds to the height difference between the 1600 vertical pixels of my 2k mode and the 1920 vertical pixels of my external screens. :(

The only difference from 515 is that 515 shows that space as black, and 525 shows that space as some kind of broken flickering reprojection.

One bug I am seeing just now: brightness is maxed out, and neither the hotkey nor the GNOME bar applet dims it in this mode. :(

dragging windows is perfectly smooth (as much as my 60hz screen can manage, anyways), though.

switching mux back to hybrid (but still in nvidia mode), just confirming: dragging windows smooth, brightness works, idle temps are again about 59/60 with chrome open (with several windows and two external monitors as always), fans between 2700~ and 3k.

dragging window causes a cpu temp spike up of 10c sustained while dragging rapidly, but no lag, and no clear or obvious pattern of increased gpu load. I suspect the cpu spike is actually from the attach-window-to-edge tiling stuff I have enabled, the actual cpu usage is only a very slight bump, probably disproportionate temp spike is from a turbo boost activating.

will now reboot into pop hybrid graphics to see if I can replicate the dragging windows issue.

Yeah, in hybrid/hybrid, I can confirm that I get this dragging windows lagginess. Idle temp/fan looks to be the same. I also noticed a slight visual artifact on one part of a window dragged across another for about 1/10 to 1/2 a second: the color of whatever pixels are dragged on top of the Psensor window gets mixed in a bit with the gray of the Psensor window's edge, in only two specific thin bars of that window. I can get a video later. This only happens with Psensor for me, so it seems to be something about how it renders in combination with this driver.

This isn't by any means unusable, to be fair, but pretty sure window dragging was smooth as long as I didn't turn on discord with hardware acceleration enabled before. This is different than that.

Another update: I always have external monitors connected, but I just checked and realized that the window dragging lag only happens when external monitors are connected.

kylebakerio commented 1 year ago

That was not a formal run through the full checklist in the README, but a 'quick' first pass through everything. It seems to me that my initial experience with dGPU mode + MUX switch set to discrete was a fluke from a fresh restart. The brightness issue is no bueno, and I am 98% sure I didn't have it before; I've never had an issue adjusting brightness on this laptop, and I've had it for a year and used every mode on it. I have gamed in Linux and Windows, switched mux/hybrid back and forth often to get 2k resolution available vs. getting PCVR to work when gaming or getting Red Dead to play nicely on Linux and Windows, benchmarking, etc.

That window stutter when dragging in hybrid isn't a great sign. It is worth noting that the idle GPU load reported in psensor when in [nvidia/mux switch nvidia] is around 35%, fwiw, with no stutter/lag on window drag in spite of that number. Idle temp is around 60c with fans at 2460rpm, which is normal, and perhaps even slightly lower than I was used to (this is with Chrome open and me typing right now, along with a few windows open). I'm used to seeing my fans at 3.5k most of the time, and temps closer to 70.

I will continue daily driving this for now and report anything I notice (e.g., when testing suspend). Is this the right place for that? Is it desired?

gabriele2000 commented 1 year ago

By the way, I've been using the latest NVIDIA driver for a week, thanks to another PPA, and I can use Wayland just fine. No diagonal tearing in Hybrid mode, or even in NVIDIA-only mode, as far as I remember from the tests I made days ago.

I'm currently on Wayland. Gaming on Wayland comes with a small performance issue: when there's significant CPU load, the GPU output can freeze briefly, and let's not forget that a freeze in Wayland is "death" since you can't restart GNOME.

Still, it works better.

leviport commented 1 year ago

Looks like the packaging sync fixed 4080 :tada:

We should re-run some quick tests to make sure there weren't any side effects.

n3m0-22 commented 1 year ago

I'm seeing no regressions on my previous testing with this change. Looks good to me.

leviport commented 1 year ago

Excellent, I think I'm satisfied then. I did some testing with the 4080, the 4090, and a 3060Ti. I also installed it on my Oryx. I'm seeing no problems.

13r0ck commented 1 year ago

@mmstick does this look correct to you? pop-os/nvidia-graphics-drivers-515

@leviport please make sure https://github.com/pop-os/nvidia-graphics-drivers/pull/171#issuecomment-1329686241 works before we release either

mmstick commented 1 year ago

The debian/control and debian/changelog will have to be updated to refer to its source package as nvidia-graphics-driver-515
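A minimal sketch of the debian/control half of that change, assuming the file has a single `Source:` line in its first paragraph. The function name is made up for illustration, and the new name is passed in by the caller, not hard-coded.

```shell
# Sketch only: rewrite the Source: field of a debian/control file in place.
# $1: path to the control file, $2: new source package name.
set_control_source() {
    control_file=$1
    new_source=$2
    tmp=$(mktemp) || return 1
    # Replace only the line that starts with "Source:"; leave the rest alone.
    sed "s/^Source:.*/Source: $new_source/" "$control_file" > "$tmp" || { rm -f "$tmp"; return 1; }
    # Write back through the original file so its permissions are kept.
    cat "$tmp" > "$control_file"
    rm -f "$tmp"
}
```

It would be called as `set_control_source debian/control <new-source-name>` from the root of the new repository. The top entry of debian/changelog needs the same source name, which this sketch does not touch.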

mwt commented 1 year ago

For those who installed the test repo, the command to remove is sudo apt-manage remove popdev-nvidia-525-60-11