yshui / picom

A lightweight compositor for X11 with animation support
https://picom.app/

Poor performance when vsync is enabled #25

Closed 9ary closed 5 years ago

9ary commented 5 years ago

Platform

Arch Linux

GPU, drivers, and screen setup

Radeon RX580, amdgpu

Environment

i3-git

Compton version

Latest -git (v3-rc2-3-ga47f112 as of this writing)

Compton configuration:

backend = "glx"
paint-on-overlay = true
unredir-if-possible = true
glx-no-stencil = true
glx-no-rebind-pixmap = true
glx-swap-method = "copy"

opacity-rule = [
    "0:_NET_WM_STATE@:32a *= '_NET_WM_STATE_HIDDEN'"
];

Testing

The following vsync modes were tested, with GALLIUM_HUD=fps to have real numbers to compare:

yshui commented 5 years ago

So opengl-swc does work? I will file this under performance problems.

Also, do you still get tearing with opengl vsync?

9ary commented 5 years ago

opengl and drm vsync do work for the most part but there is a consistent tear near the top of the screen that is noticeable if you really look for it. swc/mswc give me perfect vsync but terrible performance.

yshui commented 5 years ago

I am aware of the performance problem with moving windows in opengl, and am actively trying to solve it.

Although that might take a long while given the manpower we currently have or lack thereof.

9ary commented 5 years ago

It doesn't make sense though, like I said without vsync I'm seeing near 1000 fps figures. mswc is also working fine on my laptop (Intel graphics). I appreciate the effort you're putting in to fix things up, I've looked into it myself and decided it'd be less trouble to write a new compositor from scratch, but I don't have the motivation to work on it.

yshui commented 5 years ago

> I've looked into it myself and decided it'd be less trouble to write a new compositor from scratch

I agree with you on that one. That's why I'm trying very hard to refactor the code so it could be easier to contribute to this project.

yshui commented 5 years ago

According to @jm33-m0, this also happens on Intel cards. Refer to #26 for details about his hardware and configuration.

aufkrawall commented 5 years ago

Perhaps you are experiencing this bug? https://bugs.freedesktop.org/show_bug.cgi?id=106175 It needs to be fixed in amdgpu kernel driver, as it's a general problem with xorg pageflipping + hardware cursor and amdgpu.dc.

As a workaround, you can set "amdgpu.dc=0" as kernel parameter, unless you have Vega or Raven Ridge GPUs that aren't supported by old legacy dc. It should work normally with GLX backend and any vsync mode then (imho "drm" and "opengl" have lower input latency than the others).

If you can confirm that it's the bug I linked above, please let AMD know on their bugtracker. This issue has existed for a very long time now.
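Before comparing results, it helps to know which display code path is actually active. The following is a minimal sketch, not part of the original comment: `dc_setting` is a hypothetical helper that only inspects a kernel command line string for the `amdgpu.dc=` parameter named above. It assumes the parameter, if present, was passed on the kernel command line rather than through a modprobe option.

```shell
#!/bin/sh
# Report the amdgpu.dc value found in a kernel command line string.
# $1: the command line text, e.g. the contents of /proc/cmdline.
dc_setting() {
    case " $1 " in
        *" amdgpu.dc=0 "*) echo "0" ;;      # legacy display code requested
        *" amdgpu.dc=1 "*) echo "1" ;;      # DC explicitly requested
        *) echo "unset" ;;                  # parameter absent: driver default applies
    esac
}

# Inspect the running kernel (prints "unset" if the parameter is not there).
dc_setting "$(cat /proc/cmdline 2>/dev/null)"
```

A result of "unset" only means the parameter was not given on the command line; which code path the driver then picks depends on the kernel version and GPU.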

clapbr commented 5 years ago

The situation on this issue for me (on an RX580) has improved a lot with the latest AMD kernel https://cgit.freedesktop.org/~agd5f/linux/log/?h=drm-next-4.21-wip and mesa-git. So even if you can't or don't want to go unstable, the improvements might hit stable in a few months when kernels 4.20-4.21 and mesa 18.3 release.

aufkrawall commented 5 years ago

I already tried drm-next-4.21-wip a few days ago and it unfortunately didn't help with my issue.

clapbr commented 5 years ago

> I already tried drm-next-4.21-wip a few days ago and it unfortunately didn't help with my issue.

can you try:

vblank_mode=0 compton --backend glx --vsync opengl

and see if that's better? I've just tested it against a few other options and this is the only one that made my glx backend both stutterless and tearless. I used to use xrender with vsync=opengl because glx was a bit laggy. I suspect compton is double vsyncing if I don't specify vblank_mode=0. This doesn't solve the opengl-swc and opengl-mswc modes though, because in that case vblank_mode=0 disables vsync altogether.
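To compare modes systematically, the same launch line can be generated for each vsync mode named in this thread. This is a dry-run sketch under stated assumptions: `vsync_commands` is a hypothetical helper, it only prints the commands instead of starting compton, and the mode list is limited to the names that appear in the discussion. `GALLIUM_HUD=fps` is the frame counter used in the original report.

```shell
#!/bin/sh
# Print (do not run) one compton launch command per vsync mode.
# Pass "mesa-off" as the first argument to prefix vblank_mode=0 as well.
vsync_commands() {
    prefix="GALLIUM_HUD=fps"
    if [ "$1" = "mesa-off" ]; then
        prefix="vblank_mode=0 $prefix"
    fi
    for mode in none drm opengl opengl-oml opengl-swc opengl-mswc; do
        echo "$prefix compton --backend glx --vsync $mode"
    done
}

vsync_commands            # driver-side vsync left at its default
vsync_commands mesa-off   # driver-side vsync disabled via vblank_mode=0
```

Run each printed line by hand, one at a time, with no other compositor active, and note the fps reading and any tearing for each mode.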

aufkrawall commented 5 years ago

Thanks for your help. Yes, I tried that in the past. It looks fine at first and also has less input lag. But: After some time I noticed that there is a very small stripe of tearing at the very top of the screen. It basically leads to the same result as turning off pageflipping in xorg config.

One other ugly workaround would be to turn off the hardware cursor. Then we can move windows without stuttering in general, but cursor will stutter when there is high system load and software cursor also has increased latency.

I think we are quite screwed until AMD can manage to fix the bug.

yshui commented 5 years ago

> One other ugly workaround would be to turn off the hardware cursor.

This blows my mind. How is hardware cursor involved in this?

aufkrawall commented 5 years ago

I suppose that would need to be bisected by a kernel developer who can reproduce the issue.

The problem doesn't seem to be located in user space in any way, every fullscreen application with active hardware cursor (e.g. also some games) is affected.

clapbr commented 5 years ago

> Thanks for your help. Yes, I tried that in the past. It looks fine at first and also has less input lag. But: After some time I noticed that there is a very small stripe of tearing at the very top of the screen. It basically leads to the same result as turning off pageflipping in xorg config.

> One other ugly workaround would be to turn off the hardware cursor. Then we can move windows without stuttering in general, but cursor will stutter when there is high system load and software cursor also has increased latency.

> I think we are quite screwed until AMD can manage to fix the bug.

I think I know which small stripe you're talking about. It did use to happen for me after some time, especially if you open/close/open some windowed videos, but in this current setup it hasn't happened for me yet. I think our best option is to get AMD to fix it entirely, since other compositors that used to work fine are also in a bad state (kwin comes to mind, which for me is not even as tweakable to a mostly fine performance as compton is). AMD's own TearFree option used to be good without a compositor and is also in a bad state.

> > One other ugly workaround would be to turn off the hardware cursor.

> This blows my mind. How is hardware cursor involved in this?

In my experience hardware cursor is related to all kinds of stutter, at least indirectly, since I think if you disable it you also disable gl flips. I have some games that run like shit with hardware cursor (Diablo 3 comes to mind, but I haven't played it for ages).

Today I was testing the new Freesync patches and hardware cursor also interacted terribly with Freesync in specific games, making games go from a smooth 75fps to ~25fps only when I moved the cursor. You aren't supposed to use Freesync with compositors anyway, but I tried it for science, and compton also lags with Freesync when I move the mouse. Funnily, if I play the same games under the same setup with keyboard and gamepad they are smooth.

clapbr commented 5 years ago

Comment from dev about the HW cursor issue https://bugs.freedesktop.org/show_bug.cgi?id=106446#c2

clapbr commented 5 years ago

Tried this xorg conf with glx and all --vsync options and it gets SIGNIFICANTLY better, at the cost of a little bit of input lag and cursor flashing in some places. HW cursor is definitely messed up somehow.

Section "Device"
    ### Available Driver options are:-
    ### Values: <i>: integer, <f>: float, <bool>: "True"/"False",
    ### <string>: "String", <freq>: "<f> Hz/kHz/MHz",
    ### <percent>: "<f>%"
    ### [arg]: arg optional
    #Option     "Accel"                 # [<bool>]
    #Option     "SWcursor"              # [<bool>]
    #Option     "EnablePageFlip"        # [<bool>]
    #Option     "SubPixelOrder"         # [<str>]
    #Option     "ZaphodHeads"           # <str>
    #Option     "AccelMethod"           # <str>
    #Option     "DRI3"                  # [<bool>]
    #Option     "DRI"                   # <i>
    #Option     "ShadowPrimary"         # [<bool>]
    #Option     "TearFree"              # [<bool>]
    #Option     "DeleteUnusedDP12Displays"  # [<bool>]
    #Option     "VariableRefresh"       # [<bool>]
    Identifier  "Card0"
    Driver      "amdgpu"
    BusID       "PCI:1:0:0"
    Option      "DRI" "3"
    Option      "SWcursor" "true"
EndSection

aufkrawall commented 5 years ago

Well, if you can, you should really try "amdgpu.dc=0". Compton runs completely fine for me with this old display controller driver, and so does the hardware cursor in general.

clapbr commented 5 years ago

Good news, we have a fresh kernel patch for this: https://bugs.freedesktop.org/show_bug.cgi?id=106175#c55

I am using it on my custom kernel right now and it seems to solve the issue entirely. Let's hope it gets merged upstream soon.

aufkrawall commented 5 years ago

This is truly great. :)

However, I just noticed another amdgpu.dc bug which introduces stuttering in mpv when at least Compton is active. partyruined

clapbr commented 5 years ago

> This is truly great. :)

> However, I just noticed another amdgpu.dc bug which introduces stuttering in mpv when at least Compton is active. partyruined

Interesting, I also had mpv stuttering today on specific videos (the i stats overlay showed constant output frame drops), but it was unrelated to compton (though maybe related to dc=1 as you said). I solved it by switching hwdec from vdpau-copy to vaapi-copy. If you want to try to reproduce it, this is my mpv.conf

Video example that stutters: mpv "https://www.youtube.com/watch?v=ZaD5p0IvIvk"

aufkrawall commented 5 years ago

Yeah, I need to check more thoroughly if it's really related to compositors or if they just make it a little bit worse and thus more noticeable. I already tried vaapi x11egl and software decoding + Vulkan, both stutter identically. I'm having the bad suspicion that there might be a general issue with video-sync=display-resample and atomic modesetting drivers, as there are also reports for Intel.

aufkrawall commented 5 years ago

Ok, seems to be some very specific problem of amdgpu.dc=1, xf86-video-amdgpu + Compton + mpv video-sync=display-resample + playback in fullscreen, or in short: It doesn't stutter with Gnome-Mutter.

But I found a nice workaround: TearFree works well with amdgpu.dc=1 + the patch provided in the bugtracker ticket. So I enable both (patched) amdgpu.dc=1 and TearFree, set Vsync=none in Compton config and start Compton with defined variable vblank_mode=0. -> This results in elimination of tearing, lag and also stutter in mpv. I disable TearFree via hotkey for games.
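A sketch of what such a hotkey could call, assuming TearFree is switched at runtime through the per-output xrandr property of the amdgpu X driver. `tearfree_cmd` is a hypothetical helper and `DisplayPort-0` is a placeholder output name; list the real names with `xrandr --query`. The function only prints the command so it can be reviewed before being bound to a key.

```shell
#!/bin/sh
# Dry run: print the xrandr call that would switch TearFree for one output.
# $1: output name (placeholder below), $2: on | off
tearfree_cmd() {
    case "$2" in
        on|off)
            echo "xrandr --output $1 --set TearFree $2"
            ;;
        *)
            echo "usage: tearfree_cmd OUTPUT on|off" >&2
            return 1
            ;;
    esac
}

tearfree_cmd DisplayPort-0 on    # desktop use, Compton running with vsync = none
tearfree_cmd DisplayPort-0 off   # before starting a game
```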

aufkrawall commented 5 years ago

Good news: I found out that --vsync=drm doesn't suffer the stutter issue with mpv + amdgpu.dc=1, unlike the other offered vsync methods (apart from TearFree).

Thus, could you please activate it again by default for compiling, @yshui ?

clapbr commented 5 years ago

> Good news: I found out that --vsync=drm doesn't suffer the stutter issue with mpv + amdgpu.dc=1, unlike the other offered vsync methods (apart from TearFree).

> Thus, could you please activate it again by default for compiling, @yshui ?

I didn't have mpv problems, but I was curious to see how vsync=drm performs and it seems to be very good in every scenario I tried. Definitely worth including by default unless there is a known big issue with it.

aufkrawall commented 5 years ago

> > Good news: I found out that --vsync=drm doesn't suffer the stutter issue with mpv + amdgpu.dc=1, unlike the other offered vsync methods (apart from TearFree). Thus, could you please activate it again by default for compiling, @yshui ?

> I didn't have mpv problems, but I was curious to see how vsync=drm performs and it seems to be very good in every scenario I tried. Definitely worth including by default unless there is a known big issue with it.

Uhm, now that you say it: It seems to crash Xorg each time I minimize a window with unredir-if-possible = true. :(

yshui commented 5 years ago

@aufkrawall --vsync=drm is a pretty terrible hack, so I have reservations about enabling it even if it works for some users.

aufkrawall commented 5 years ago

Yeah, I wouldn't have posted it if I discovered the crash issue earlier.

Talking about vsync: Since you are considering restructuring a lot of Compton, could you also take a look at vsync? vsync = opengl has the lowest input lag of the available modes, but it's still significantly higher than Gnome-Mutter's.

yshui commented 5 years ago

There are also people reporting vsync = opengl not working for them. Honestly, I think opengl-swc is the best option in terms of compatibility. Do you have a bad experience with that option?

I'm not very sure what you mean by input lag. Do you mean the delay between input (say, key press) and the reaction? Is there a concrete way to measure that? Is it the same for all applications?

clapbr commented 5 years ago

Simplest way is dragging windows, but even by just moving the cursor I can feel more lag with opengl-swc/mswc/oml than with opengl or drm

aufkrawall commented 5 years ago

Yes, latency of render output because of vsync frame queue length. Vsync queue latency (which of course affects input latency) can be tested relatively well by dragging windows with a mouse with a proper sensor, disabled cursor acceleration (libinput flat profile) and not too high sensitivity. You can see how far the window is lagging behind the cursor when doing so.

opengl and drm have the same latency (I suspect at least 2 frames), while opengl-oml and opengl-swc feel even less direct (probably at least 3 frames delay). Gnome-Mutter feels more direct, like 1 frame delay (or definitely less than drm and opengl). When you start Compton with vblank_mode=0 and vsync = none, there is zero additional latency caused by compositing.
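To put those frame counts into time units, here is a small worked conversion. It is only arithmetic on the estimates above: the 60 Hz refresh rate is an assumption (no refresh rate is stated for this setup), and `frames_to_ms` is a hypothetical helper.

```shell
#!/bin/sh
# Convert a delay measured in frames into milliseconds at a given refresh rate.
# $1: number of frames, $2: refresh rate in Hz
frames_to_ms() {
    awk -v frames="$1" -v hz="$2" 'BEGIN { printf "%.1f\n", frames * 1000 / hz }'
}

# 1, 2 and 3 frames are the estimated delays mentioned above; 60 Hz is assumed.
for frames in 1 2 3; do
    echo "$frames frame(s) at 60 Hz: $(frames_to_ms "$frames" 60) ms"
done
```

At the assumed 60 Hz, the difference between a 2-frame and a 3-frame delay is one refresh interval, roughly 16.7 ms.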

yshui commented 5 years ago

That is interesting. @clapbr @aufkrawall what are your drivers again?

aufkrawall commented 5 years ago

amdgpu + radeonsi of mesa for Polaris GPU (recent git versions). I think the same applies to Intel i915 + i965 of mesa.

yshui commented 5 years ago

@aufkrawall If I understand correctly, what you observe is that when you move a window around using your pointer, the window lags behind the pointer by about 2 frames?

Correct me if I'm wrong, but I think there is no such thing as a "vsync queue". If we use double buffering, the number of queued frames should be exactly 1. However, it is possible that there are queued render commands in the GPU, and because they are not properly flushed when we swap buffers, the output lags. (A better explanation can be found here: https://www.khronos.org/opengl/wiki/Swap_Interval#GPU_vs_CPU_synchronization)

But that is purely my guess.

So, with that in mind, I dug around in the source code and found an undocumented option, --vsync-use-glfinish, which might help. Can you give it a try?

9ary commented 5 years ago

Yeah, I can confirm the latency issues. I think it's noteworthy that vsync=drm and vsync=opengl have much lower latency, but they both show a consistent tear near the top of the screen (kinda hard to see under most circumstances). The latency with oml and swc might be related to whatever the present extension is doing. It is also visible in this minimal compositor test code. It will copy the focused window at startup. Try running it with and without vblank_mode=0 to compare, but make sure you don't have another compositor running.

clapbr commented 5 years ago

> So, with that in mind, I dug around in the source code and found an undocumented option, --vsync-use-glfinish, which might help. Can you give it a try?

Briefly tried it:

compton --vsync=opengl-swc --vsync-use-glfinish // couldn't feel a difference
compton --vsync=opengl --vsync-use-glfinish // starts constant stutter/low fps
compton --vsync=drm --vsync-use-glfinish // starts constant stutter/low fps

aufkrawall commented 5 years ago

@Streetwalrus

> I think it's noteworthy that vsync=drm and vsync=opengl have much lower latency, but they both show a consistent tear near the top of the screen (kinda hard to see under most circumstances).

Are you sure that DRI3 is working correctly for you? At least opengl (haven't tested drm that much) definitely doesn't show any tearing here with AMD & DRI3.

@clapbr

> compton --vsync=opengl-swc --vsync-use-glfinish // couldn't feel a difference

Are you sure? To me it feels like it reduces input latency to that of opengl.

> compton --vsync=opengl --vsync-use-glfinish // starts constant stutter/low fps

Can confirm that.

It looks like vsync = opengl-swc doesn't show the stutter in mpv after all, and thanks to --vsync-use-glfinish it's also usable for me regarding latency. I have to test it a bit further to be certain though.

For some weird reason, --vsync-use-glfinish must be defined at launch and doesn't work in a config file.

9ary commented 5 years ago

> Are you sure that DRI3 is working correctly for you? At least opengl (haven't tested drm that much) definitely doesn't show any tearing here with AMD & DRI3.

Absolutely certain.

aufkrawall commented 5 years ago

> Absolutely certain.

Huh, weird. I once checked it on my Haswell / Broadwell (?) Atom and it seemed to work normally. That was quite some time ago though, and I guess it doesn't matter much if that vsync API isn't recommended in the first place.

9ary commented 5 years ago

It could be related to my 4k monitor. Looking at the timings, the vblank interval is relatively shorter compared to 1080p modes, so it's not unlikely that the overhead of waiting for vblank, and then performing a full screen copy on a buffer that is 4x larger misses it consistently. I'm pretty sure the Present extension is smarter than that, and copies the buffer before vsync comes in.
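The "4x larger" figure can be checked with a quick calculation. This sketch assumes a 3840x2160 mode and 4 bytes per pixel (32-bit colour); neither is stated explicitly above, and `buffer_bytes` is a hypothetical helper.

```shell
#!/bin/sh
# Bytes in one full-screen buffer, assuming 4 bytes per pixel.
# $1: width in pixels, $2: height in pixels
buffer_bytes() {
    echo $(( $1 * $2 * 4 ))
}

uhd=$(buffer_bytes 3840 2160)   # assumed 4K mode
fhd=$(buffer_bytes 1920 1080)   # 1080p for comparison
echo "4K buffer:    $uhd bytes"
echo "1080p buffer: $fhd bytes"
echo "ratio:        $(( uhd / fhd ))x"
```

Under these assumptions a full-screen copy moves about 33 MB per frame at 4K versus about 8 MB at 1080p, which fits the idea that a copy started only after vblank has less margin on the larger mode.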

yshui commented 5 years ago

> For some weird reason, --vsync-use-glfinish must be defined at launch and doesn't work in a config file.

@aufkrawall Yes, because it is not a config option, only a command line option. Not sure why the original developers didn't document it.

yshui commented 5 years ago

@Streetwalrus @aufkrawall I can confirm vsync = opengl does indeed cause tearing on my Intel graphics card.

aufkrawall commented 5 years ago

> @aufkrawall Yes, because it is not a config option, only a command line option. Not sure why the original developers didn't document it.

If it turns out that it doesn't produce any trouble, perhaps it would be good to automatically enable it when at least vsync = opengl-swc is used? It definitely helps in my case.

Of course it would still be fantastic if you could manage to reduce latency even further. :)

Can vsync = opengl-mswc be considered redundant?

yshui commented 5 years ago

--vsync-use-glfinish causes compton to use 100% CPU here (nvidia driver). So I guess no, can't enable it by default. But I think I need to investigate what is happening.

yshui commented 5 years ago

Looks like nvidia driver uses busy wait for glFinish() :(

yshui commented 5 years ago

I think I might be able to do something about this. This will make the drawing logic a bit more complicated, but maybe it will be worth it.

yshui commented 5 years ago

I made a page explaining why the lag happens: https://github.com/yshui/compton/wiki/Vsync-Lag-Explained

The conclusion is that glFinish is not good enough; some other way of waiting on SwapBuffers is needed.

aufkrawall commented 5 years ago

Funny, there is a similar (not saying it would be related) issue with the Nvidia Vulkan driver & mpv: https://github.com/mpv-player/mpv/issues/6172 It's so hurtful, I'm so glad I sold my Nvidia card...

aufkrawall commented 5 years ago

Good news: The fix for the initially reported issue will be in 4.21 kernel: https://cgit.freedesktop.org/~agd5f/linux/commit/?h=drm-next-4.21-wip&id=0e65ba74dbd61f54f2dc74035d07490d5fd99a38

However, I've noticed that vsync = opengl-swc doesn't entirely get rid of the stutter problem in mpv with amdgpu.dc=1. It's basically gone with hardware video decoding, but not with software decoding. Luckily, this doesn't apply to the TearFree method described above; I'm still convinced it's entirely free of stutter.

@yshui Would it be possible to implement an option to run a command when Compton enables and disables unredir? This way TearFree could be automatically enabled and disabled for fullscreen windows via xrandr.

9ary commented 5 years ago

@aufkrawall does mpv report the jitter (shift+i)? Or is it compton itself that's stuttering (run with GALLIUM_HUD=fps)?

aufkrawall commented 5 years ago

The issue doesn't exist for the performance metrics, Compton fps are stable and vsync jitter in mpv very low. It's also quite hard to perceive, it's a sporadic micro stutter (not caused by e.g. RedShift, this is another issue). The issue happens only with amdgpu.dc=1 + Compton vsync + mpv.

It doesn't happen (aka is totally smooth) with:

- Gnome-Mutter
- Compton + TearFree instead of Compton vsync
- mpv in fullscreen without any compositor
- any other application than mpv, e.g. Firefox + Compton vsync looks totally smooth
- amdgpu.dc=0

I suppose it's some specific weirdness of the vsync part inside the display driver.

Edit: Btw.: Why is vsync enabled with mesa drivers, even with vsync = none? I don't think it should be normal that Compton has to be started with vblank_mode=0 in order to actually disable vsync.