pop-os / nvidia-graphics-drivers

Pop!_OS NVIDIA Graphics Drivers

Sporadic lag spikes after updating to Pop!_OS 22.04 #152

Open pushfoo opened 2 years ago

pushfoo commented 2 years ago

Distribution (run cat /etc/os-release):

NAME="Pop!_OS"
VERSION="22.04 LTS"
ID=pop
ID_LIKE="ubuntu debian"
PRETTY_NAME="Pop!_OS 22.04 LTS"
VERSION_ID="22.04"
HOME_URL="https://pop.system76.com"
SUPPORT_URL="https://support.system76.com"
BUG_REPORT_URL="https://github.com/pop-os/pop/issues"
PRIVACY_POLICY_URL="https://system76.com/privacy"
VERSION_CODENAME=jammy
UBUNTU_CODENAME=jammy
LOGO=distributor-logo-pop-os

Related Application and/or Package Version (run apt policy $PACKAGE_NAME): The affected user did not provide detailed information on this. I'm reporting it on their behalf, as I was considering purchasing a refurbished version of similar hardware.

Applications that seem to be mentioned on Discord and elsewhere include:

  1. Steam installed through apt
  2. Native Linux games
  3. Windows games under WINE/Proton

See next section for a quote with additional information.

Issue/Bug Description: Updating to Pop!_OS 22.04 appears to have created short sporadic lag spikes when playing games:

Hello, since i updated to 22.04 a few days ago i've been having a LOT of lag in games, as in periods of 3-4 seconds when i go from the usual fps to 5-10 fps. This happens regardless of how i'm running with games, so it happens with wine games (e.g. Maniaplanet) and native games (e.g. Portal 2). These 2 worked perfectly before the update and i admit that i am completely lost on what i should do now.

The following temporary fix, applied through the power menu in the top bar, staves off the problem for an indeterminate amount of time:

  1. Set the GPU mode to integrated
  2. Reboot
  3. Set the GPU mode to dedicated
  4. Reboot
  5. Set the Power profile to high performance
  6. Reboot if necessary
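For anyone who prefers a terminal, the steps above can probably be reproduced with the `system76-power` command-line tool that ships with Pop!_OS. This is only a sketch under assumptions: that the "dedicated" entry in the top-bar menu corresponds to the `nvidia` graphics mode and "high performance" to the `performance` profile. The helper name `workaround_commands` is made up for illustration, and the script only prints the commands instead of running them, because each graphics switch needs a reboot before the next step.

```python
"""Print a terminal equivalent of the top-bar workaround, step by step."""

# Assumption: "dedicated" in the top-bar menu maps to the `nvidia` graphics
# mode, and "high performance" maps to the `performance` power profile.
STEPS = [
    ("sudo system76-power graphics integrated", True),
    ("sudo system76-power graphics nvidia", True),
    ("system76-power profile performance", False),
]


def workaround_commands():
    """Return the commands in order, with a reboot after each graphics switch."""
    lines = []
    for command, needs_reboot in STEPS:
        lines.append(command)
        if needs_reboot:
            lines.append("systemctl reboot")
    return lines


if __name__ == "__main__":
    # Nothing is executed here; copy the lines one at a time.
    for line in workaround_commands():
        print(line)
```

Printing rather than executing keeps the sketch safe to run on any machine, including ones without an NVIDIA GPU.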

This is a chart of GPU usage from the affected user, with red underlines indicating periods of dropped frame rate: [image]

Steps to reproduce (if you know):

  1. Use this hardware: [image]
  2. Update to 22.04
  3. Launch games
  4. Wait for lag spikes

Expected behavior: No lag spikes, as with previous Pop!_OS versions.

Other Notes:

Brock directed me to post this message here after I attempted to help the original user, @LinUwUxCat. I'm still not completely sure if it's a kernel issue or something to do with this repo.

The user tried changing kernel versions through bkw777/mainline, but it didn't seem to fix things:

after installing https://github.com/bkw777/mainline and installing the 5.15.10 version of the linux kernel, then manually copying it to my EFI partition (/boot/efi/EFI/Pop_OS-0d3388aa-8d29-4572-9b50-dc53002b54db) and then manually adding an entry in the /boot/efi/loader/entries/, i booted on it. So far no lags, but i'll have to test a bit further tomorrow. And now it's happening again, with the older kernel.

So far, kernel versions tried appear to include 5.15 and 5.10.114.

This issue has also been discussed in the following locations:

  1. Discord
  2. Reddit

pushfoo commented 2 years ago

@LinUwUxCat could you also please provide information about which Pop!_OS version you upgraded from?

LinUwUxCat commented 2 years ago

Hello, I updated from 21.10, which was updated from 21.04 a few months before that.

LinUwUxCat commented 2 years ago

I can also say that the lags stopped when i switched to Hybrid graphics yesterday, and everything was fine until right now, where i started getting lags again. I don't think this is a kernel issue and i have no idea where this could come from.

pushfoo commented 2 years ago

Wasn't upgrading to the 510 drivers part of the fix, or did I misunderstand the Discord discussion?

LinUwUxCat commented 2 years ago

The problem was indeed fixed by upgrading the nvidia drivers from 470 to 510.
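For others comparing notes, it may help to confirm which driver branch is actually loaded before and after an upgrade. A minimal sketch, assuming `nvidia-smi` is on the PATH; the function names are made up for illustration, and the comparison against 510 only mirrors the branch mentioned in this thread, not an official cutoff.

```python
import subprocess


def driver_major(version):
    """Return the major branch of a version string such as '515.48.07'."""
    return int(version.strip().split(".")[0])


def installed_driver_version():
    """Ask nvidia-smi for the loaded driver version (first GPU only)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()[0].strip()


if __name__ == "__main__":
    try:
        version = installed_driver_version()
    except (FileNotFoundError, subprocess.CalledProcessError, IndexError) as err:
        print(f"could not query nvidia-smi: {err}")
    else:
        branch = driver_major(version)
        # 510 is the branch reported in this thread; the threshold is illustrative.
        relation = "at or above" if branch >= 510 else "below"
        print(f"driver {version} is {relation} the 510 branch")
```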

pushfoo commented 2 years ago

I'm going to leave this open for now. My understanding is that each NVIDIA driver release drops support for some cards, which means that some users with older GPUs may be stuck on driver versions that recreate the problem.

LinUwUxCat commented 2 years ago

Quick update, the lag spikes are still present while on NVIDIA-only mode, but not on hybrid mode (for now)

mmstick commented 2 years ago

Moving this to the NVIDIA repository since it's NVIDIA driver related.

LinUwUxCat commented 1 year ago

Hey guess what's happening again with exactly the same settings as before? Thank you, nvidia.

LinUwUxCat commented 1 year ago

friendly reminder that this is still a thing, even on linux 5.19 and with 515.48.07 drivers. I would really like my problem to be solved, but i guess it won't be.

leviport commented 1 year ago

@LinUwUxCat are you sure you aren't experiencing https://github.com/pop-os/nvidia-graphics-drivers/issues/61 instead? If you are, switching to Nvidia mode should fix it.

LinUwUxCat commented 1 year ago

@LinUwUxCat are you sure you aren't experiencing #61 instead? If you are, switching to Nvidia mode should fix it.

In the messages above:

Quick update, the lag spikes are still present while on NVIDIA-only mode, but not on hybrid mode (for now)

However, i have since then found the issue, but not how to fix it. When lag spikes happen, i observed that the gpu switches to PCIe 1.1, as if it did not detect something to work on anymore. When the lag is over, it's back to PCIe 4. If anyone has any idea on how to enforce PCIe 4, i'll welcome it. This seems to be an issue on Windows as well by the way. As such, the issue probably comes from either drivers or a faulty GPU, which i'm not sure how i can test.

EDIT: here's the proof, a picture of nvtop during a lag spike. It's hard to show because the whole computer is laggy but here it is: [image]
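The PCIe observation above could be logged over time instead of caught in a screenshot. Below is a rough sketch, assuming `nvidia-smi` is installed and supports the `pcie.link.gen.current`, `pcie.link.gen.max` and `utilization.gpu` query fields (they are listed by `nvidia-smi --help-query-gpu`). The function names are hypothetical. Note that the link generation can also drop while the GPU is idle as a power-saving measure, so a flagged sample only means something if a game was running at that moment.

```python
import subprocess
import time

# Field names as listed by `nvidia-smi --help-query-gpu`.
QUERY = "pcie.link.gen.current,pcie.link.gen.max,utilization.gpu"


def parse_sample(line):
    """Parse one CSV line such as '1, 4, 97' into (current_gen, max_gen, gpu_util)."""
    current, maximum, util = (int(field.strip()) for field in line.split(","))
    return current, maximum, util


def read_sample():
    """Query the first GPU through nvidia-smi; raises if the tool is missing or fails."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.splitlines()[0])


def is_downgraded(current, maximum):
    """True when the link runs below the maximum generation reported for this setup."""
    return current < maximum


def monitor(samples=5, interval=1.0):
    """Print one timestamped line per sample, marking samples with a reduced link."""
    for _ in range(samples):
        current, maximum, util = read_sample()
        mark = "  <-- link below max" if is_downgraded(current, maximum) else ""
        print(f"{time.strftime('%H:%M:%S')} gen {current}/{maximum} util {util}%{mark}")
        time.sleep(interval)


if __name__ == "__main__":
    try:
        monitor()
    except (FileNotFoundError, subprocess.CalledProcessError, ValueError, IndexError) as err:
        print(f"could not read PCIe link state: {err}")
```

Calling `monitor(samples=600)` while a game is running and matching the flagged timestamps against the lag spikes would show whether the two really coincide.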

tristann21 commented 2 weeks ago

I seem to be getting the same issue here in 2024, which makes me think this is a hardware problem. Were you able to find out what was causing this (whether it's driver, software, or hardware related)?

Cheers!