hyprwm / Hyprland

Hyprland is a highly customizable dynamic tiling Wayland compositor that doesn't sacrifice on its looks.
https://hyprland.org
BSD 3-Clause "New" or "Revised" License

nvidia_anti_flicker = false significantly increases CPU and GPU utilization + unusable lag after opening special #4294

Open TZProgrammer opened 6 months ago

TZProgrammer commented 6 months ago

Hyprland Version

Hyprland, built from branch main at commit 78f9ba9fdd7e258747b862ca2ae7d4cf5335f4ed dirty (makefile: add symbolic link for lowercase binary name). Date: Fri Dec 29 04:37:58 2023 Tag: v0.33.1-113-g78f9ba9f

Bug or Regression?

Bug

Description

I toggled nvidia_anti_flicker to false to fix low framerate. Now, GPU temps are up, Hyprland utilization went from averaging 1% of CPU to 4%, and when I open special, the mouse lags and becomes virtually unusable.

GPU temps, which usually sit in the low 40s, climb into the 60s, and the fan audibly speeds up to counteract the heat.

How to reproduce

Toggle nvidia_anti_flicker to false, then open the special workspace. On my system at least, there is a lot of lag both when opening special and after it is open, mostly noticeable in mouse movement.

Crash reports, logs, images, videos

image

The beginning of the CPU utilization graph is when nvidia_anti_flicker is toggled false. Soon after, I toggle it to true and the CPU load goes down (Given the GPU temp difference, I imagine GPU load is also going down).

vaxerski commented 6 months ago

do you have decoration:blur:special enabled? it's very expensive.

TZProgrammer commented 6 months ago

Yes, I do. I will disable it and let you know if it improves. Just one second, I am finishing a test of power consumption with the anti-flicker off versus on.

TZProgrammer commented 6 months ago

Just in case this information is in any way useful:

These are the GPU stats with nvidia_anti_flicker = false. Low GPU temps, and only 6 Watts being used by the GPU (RTX 2070 Mobile): image

These are the GPU stats with nvidia_anti_flicker = true. Temps have risen over 20 degrees Celsius, and power consumption has roughly quintupled: image

TZProgrammer commented 6 months ago

With blur turned off, the special no longer lags while loading. However, my GPU power consumption and temps have not improved.

vaxerski commented 6 months ago

I am seeing that you have translucent windows with blur - blur on special is equally expensive - either turn it off or enable xray.
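For reference, a minimal hyprland.conf sketch of the xray option mentioned above (option path per the Hyprland wiki around the time of this thread; verify against your version):

```ini
decoration {
    blur {
        enabled = true
        # xray makes blur sample only the background/wallpaper, skipping
        # windows underneath, which is much cheaper than full blur stacks
        xray = true
    }
}
```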

TZProgrammer commented 6 months ago

Currently, I have disabled many of the effects. This is the script I ran: image

The only effect I am seeing that is still in effect is translucent windows (no blur, however).

GPU stats are somehow worse than before, although I have not opened any new apps, just passed my mouse between monitors.

image

vaxerski commented 6 months ago

I mean if they are translucent hyprland needs to pump out 2x as many pixels with special open, sounds quite normal, no? Your gpu is an nvidia mobile, not only mobile, but nvidia..

TZProgrammer commented 6 months ago

I think the increased power usage in the last screenshot might be because autocpufreq turned off powersaving mode due to the high load, so not related to turning off the effects.

TZProgrammer commented 6 months ago

I can see why the load might be high in the first place due to the effects, but I still don't understand why it would increase this dramatically only after turning anti-flickering off. All of these effects were present when anti-flicker = true, so should the power draw not be high then as well? From what I can tell, the only difference in the code is that glFlush is called instead of glFinish, but why would that 5x the power consumption?

vaxerski commented 6 months ago

likely because nvidia

TZProgrammer commented 6 months ago

Fair, lol.

By default my integrated GPU should be used though, so I still don't understand why it seems like the Nvidia card is being relied on so strongly now.

vaxerski commented 6 months ago

shouldnt be. you can specify what card is used for hl with WLR_DRM_DEVICES

TZProgrammer commented 6 months ago

It is already set. However, when toggling nvidia_anti_flicker, the Nvidia card starts being used to a great extent.

This is what I have in my hyprland.conf: env = WLR_DRM_DEVICES,/dev/dri/card0:/dev/dri/card1

And these are my nvidia env vars: env = GBM_BACKEND,nvidia-drm env = LIBVA_DRIVER_NAME,nvidia env = __GLX_VENDOR_LIBRARY_NAME,nvidia env = __GL_VRR_ALLOWED,1
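For what it's worth, wlroots treats the first entry in that colon-separated list as the primary (rendering) device, so the order matters. A quick shell sketch of how the list is read (the card numbering here is machine-specific):

```shell
# wlroots picks the FIRST entry of WLR_DRM_DEVICES as the primary
# (rendering) device; later entries are secondary outputs-only devices.
WLR_DRM_DEVICES=/dev/dri/card0:/dev/dri/card1
primary=${WLR_DRM_DEVICES%%:*}   # strip everything after the first ':'
echo "$primary"
```

Note that /dev/dri/cardN numbering is not guaranteed stable across boots; `ls -l /dev/dri/by-path/` shows which PCI device each card actually is.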

vaxerski commented 6 months ago

no clue. Nvidia moment I guess.

TZProgrammer commented 6 months ago

Yeah :(

TZProgrammer commented 6 months ago

One more thing that might be relevant. This is related to #4240.

When nvidia_anti_flicker = true, I get 60 fps on vkmark. Meanwhile, if nvidia_anti_flicker = false, I get 1132 fps on vkmark.

I am wondering whether, with nvidia_anti_flicker = false, there is no FPS cap, so Hyprland's frame rate far exceeds the monitor's refresh rate. I apologize for my ignorance, but could this be the reason for the increased power draw?

Image for vkmark runs: image

vaxerski commented 6 months ago

possible, but nvidia_anti_flicker is basically what the nvidia patches were, so I'll leave it on by default. For the vast majority they fix flicker and terrible artifacts. Blame nvidia.

abmantis commented 5 months ago

I also have sort of the same issue.

With nvidia_anti_flicker = true: the whole desktop feels a bit laggy, even the mouse, and glxgears runs at ~30 fps. If a video is playing on YouTube in the foreground, glxgears goes up to 60 fps.

With nvidia_anti_flicker = false: the desktop is smooth overall and glxgears runs at 60 fps, but some apps have delayed rendering except when there is input (a keypress or mouse movement). This is very noticeable in terminals, where the output sometimes does not update until I press another key or move the mouse. If glxgears is running, the delay is gone.

I would prefer to have nvidia_anti_flicker = false, if it wasn't for the rendering delays on some apps.

By the way, some extra info on glFlush vs glFinish:

My laptop has an Intel iGPU and an Nvidia RTX 3050 Ti Mobile. The Nvidia card is the one connected to the external ports, so unfortunately I cannot just turn it off.

UPDATE: setting debug:damage_tracking = 0 and nvidia_anti_flicker = false makes everything smooth and working great, so it looks like a problem with damage tracking?

eingrid commented 2 months ago

I noticed that when I set nvidia_anti_flicker to false, I experienced significant input lag on my second monitor in certain applications, like the terminal or rofi. However, after adding debug:damage_tracking = 0 to the Hyprland config, everything seems smooth now. Much appreciated!

tchofy commented 2 months ago

@eingrid I'd advise against using that as a solution unless you know what you're doing. Disabling damage tracking forces your GPU to re-render the entire screen(s) at the full refresh rate, even when nothing on screen has changed.

abmantis commented 2 months ago

I now use the following patch to render 10 extra frames, which seems to solve all the missing frame issues I had, without the full impact that disabling damage tracking has:

diff --git a/src/helpers/Monitor.hpp b/src/helpers/Monitor.hpp
index c08cdea4..16423aba 100644
--- a/src/helpers/Monitor.hpp
+++ b/src/helpers/Monitor.hpp
@@ -73,6 +73,8 @@ class CMonitor {

     CMonitorState   state;

+    int extraRenderFrames = 0;
+
     // WLR stuff
     wlr_damage_ring         damage;
     wlr_output*             output          = nullptr;
diff --git a/src/render/Renderer.cpp b/src/render/Renderer.cpp
index 863279ac..8c565d64 100644
--- a/src/render/Renderer.cpp
+++ b/src/render/Renderer.cpp
@@ -1084,6 +1084,13 @@ void CHyprRenderer::renderMonitor(CMonitor* pMonitor) {
     // check the damage
     bool hasChanged = pMonitor->output->needs_frame || pixman_region32_not_empty(&pMonitor->damage.current);

+    if (hasChanged) {
+        pMonitor->extraRenderFrames = 10;
+    } else if (pMonitor->extraRenderFrames > 0) {
+        pMonitor->extraRenderFrames -= 1;
+        hasChanged = true;
+    }
+
     if (!hasChanged && **PDAMAGETRACKINGMODE != DAMAGE_TRACKING_NONE && pMonitor->forceFullFrames == 0 && damageBlinkCleanup == 0)
         return;

It may require some changes to apply to the current master branch, but shouldn't be too hard.

vaxerski commented 2 months ago

feel free to make a MR. 10 seems a lot though.

abmantis commented 2 months ago

feel free to make a MR. 10 seems a lot though.

Sure, I will. I never did because when I posted about it on Discord I got the feeling it would not make sense, since this is a workaround for Nvidia, AFAIK.

At 6 I still got some small artifacts from time to time, so ended up at 10 just to be safe.

abmantis commented 2 months ago

@vaxerski should this be put behind a setting?

vaxerski commented 2 months ago

probably

thejch commented 2 months ago

Since this comes with a (small?) performance hit and most people don't need it, if it becomes a PR it should default to 0, with the number of frames configurable, IMO.

abmantis commented 1 month ago

I'll wait to test out 555 drivers before opening a PR, since people are reporting a lot of improvements with those.