Open vanfanel opened 2 years ago
This piece of code DOES wait for actual buffer swap event:
struct wl_callback *callback;
int frame_done;
frame_done = 0;
callback = wl_surface_frame(wl->surface);
if (callback == NULL)
    return;
// Issue buffer swap.
egl_swap_buffers(&wl->egl);
// The callback will set frame_done to true when receiving event.
wl_callback_add_listener(callback, &frame_listener, &frame_done);
// Stay in loop until the issued buffer swap is actually done.
// wl_display_dispatch() returns the number of dispatched events, or -1 on failure,
// so keep dispatching until the frame event arrives or the connection fails.
while (!frame_done && wl_display_dispatch(wl->input.dpy) != -1) {}
...and this is the callback function implementation, which obviously should be defined before frame_listener is passed to wl_callback_add_listener().
static void
frame_callback(void *data, struct wl_callback *callback, uint32_t serial)
{
    int *done = data;
    *done = 1;
    wl_callback_destroy(callback);
}
static const struct wl_callback_listener frame_listener = {
    frame_callback
};
Whether this improves input latency on Wayland + OpenGL is something I cannot be totally sure about.
It should work with VSYNC=OFF, since this needs a non-blocking eglSwapBuffers(): we are explicitly waiting for the buffer swap completion event ourselves.
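For anyone who wants to see the control flow of that wait loop without a running compositor, here is a minimal sketch in plain C. Everything prefixed mock_ is a hypothetical stand-in, not a libwayland symbol; mock_dispatch() only imitates the documented wl_display_dispatch() contract (number of events dispatched on success, -1 on failure), which is why the loop compares against -1.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Wayland display connection (NOT libwayland). */
struct mock_display {
    int events_until_frame;        /* unrelated events queued before "done" */
    void (*on_frame)(void *data);  /* registered one-shot frame listener */
    void *listener_data;
    int broken;                    /* non-zero simulates a lost connection */
};

/* Imitates wl_display_dispatch(): returns events dispatched, or -1 on failure. */
static int mock_dispatch(struct mock_display *dpy)
{
    if (dpy->broken)
        return -1;
    if (dpy->events_until_frame > 0) {
        dpy->events_until_frame--;
        return 1;                  /* dispatched an unrelated event (e.g. input) */
    }
    if (dpy->on_frame != NULL) {
        void (*cb)(void *) = dpy->on_frame;
        dpy->on_frame = NULL;      /* one-shot, like a wl_callback */
        cb(dpy->listener_data);
    }
    return 1;
}

/* Same job as frame_callback(): flag that the frame event arrived. */
static void mock_frame_done(void *data)
{
    int *done = data;
    *done = 1;
}

/* Returns 1 if the frame event arrived, 0 if dispatching failed first. */
static int wait_for_frame(struct mock_display *dpy)
{
    int frame_done = 0;
    dpy->on_frame = mock_frame_done;
    dpy->listener_data = &frame_done;
    /* The real code would issue the (non-blocking) buffer swap here. */
    while (!frame_done && mock_dispatch(dpy) != -1) {}
    return frame_done;
}
```

With three unrelated events queued, wait_for_frame() keeps dispatching through them and only returns once the frame listener has fired; with the broken flag set it returns 0 instead of spinning forever.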
I was going to check the branch you asked me to test on the forums, but it is gone now.
@nfp0 Yes, I did my own tests and there was no improvement to be seen. Let's simply wait for Vulkan to be fixed on Wayland.
Sigh... yeah. I'm anxiously waiting for this: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12086
But it's taking so long. 😩
@nfp0 Same here... Do you think that PR is forgotten or something? Wayland is supposed to be the future, having massive input lag with it is not a good sign.
True. Maybe we should ping the PR asking how progress is going, if any?
I opened this issue a few months ago: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6249
This prompted some discussion on the issue. @Themaister chimed in too back then. It sounded to me like some of the Mesa devs either thought this was a RetroArch issue or that it was not important enough to be a priority.
@nfp0 I ping-ed the issue. Let's hope they merge it. If 2 buffers could be forced in the driver somehow...
@vanfanel Thanks. Let's hope they wake up.
If 2 buffers could be forced in the driver somehow...
Well, I've forced it to 2 on RetroArch and that indeed gave me the lowest possible input-lag. Here's the change I did: https://github.com/libretro/RetroArch/pull/13823 With that I arrived at the same input lag values as KMS and AMDVLK.
I would use it like that, but I remember I had an issue in fullscreen mode. But that might've been an unrelated issue. Give it a try and see if you find any issues with it. I'll have to give it a try again too.
@nfp0 Thanks for the patch! Indeed, it works for forcing Vulkan to give us the specified number of buffers: with that, RetroArch on Wayland AT LONG LAST says that it's allocating 2 buffers if max_swapchain is 2. However, did you measure input lag? I don't have instruments at hand now (i.e. fast camera + LED-powered pad) but I feel like lag is still WAY higher on Wayland :( Super Mario World is perfect for feeling it.
@vanfanel Nice! I did measure it, yes. I arrived at the same input lag as KMS and AMDVLK from my measurement posts on the forums, which is the minimum possible latency on my setup (with frame delay disabled).
My setup was a Manjaro KDE Wayland system with RetroArch running fullscreen. I filmed my finger pressing the button on my wired keyboard at 480 fps with my OnePlus 7 Pro, which is more than enough to count the time between press and reaction on the monitor. Not the most scientific, I know, but it's good enough to count how many frames are being buffered on the PC. I did the tests with the bsnes-mercury core running the Horiz/Vert Stripes test on the 240p Test Suite, because that specific test has next-frame reaction to the input and that makes counting frames easier.
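To make the frame counting concrete, here is a small sketch of the conversion from slow-motion camera frames to milliseconds. The function names are made up for illustration, and the 60 Hz display refresh is an assumption (the thread only gives the 480 fps camera rate).

```c
/* Convert a count of slow-motion camera frames into milliseconds. */
static double camera_frames_to_ms(int camera_frames, double camera_fps)
{
    return camera_frames * 1000.0 / camera_fps;
}

/* How many camera frames span one display refresh interval. */
static double camera_frames_per_refresh(double camera_fps, double refresh_hz)
{
    return camera_fps / refresh_hz;
}
```

At 480 fps each camera frame covers roughly 2.08 ms, so a 60 Hz refresh interval spans 8 camera frames. A hypothetical count of 24 camera frames between key press and screen reaction would therefore correspond to 50 ms, or 3 refresh intervals.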
Super Mario World is perfect for feeling it.
I wouldn't go by our human feelings for this because we're already working with extremely low levels of input lag. I'm usually very picky about input lag and feel it in places where most people don't notice it at all, but I gotta admit I can't really perceive the difference between 2 and 3 swapchain images. For perspective, New Super Mario Bros U and Smash Ultimate on the Switch have 5 or 6 frames of input lag.
But of course, perception is not important here. Lower is always better, and allows us to react faster.
@nfp0 Did you measure swapchain=2 vs swapchain=3? I mean, with your equipment.
No, I did not measure swapchain=3. But I can do that to make sure.
No, I did not measure swapchain=3. But I can do that to make sure.
Yes please, measure swapchain=3 and swapchain=2 and tell me what difference you see.
Sure thing. I'll try to get to it in the next few days.
@vanfanel I'm back with some numbers. Numbers that confirm swapchain=2 indeed reduces input lag by one frame. :slightly_smiling_face:
I did 10 random measurements for each scenario and averaged the value. Here are the values I've arrived at:
That's a 15ms difference between 2 and 3 swapchain images (almost 1 frame at 60fps, as expected). 3 and 4 swapchain images have the same input lag, with any difference within the margin of error, but I remember reading something on @Themaister's blog about Mesa reporting 4 images and then only using 3. If you want to validate my measurements I can upload the slow-motion videos to YouTube or something.
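As a rough illustration of the arithmetic behind "15ms is almost 1 frame at 60fps", the sketch below averages a set of latency samples and expresses a latency in refresh intervals. The helper names are hypothetical and no measured sample values from the thread are reproduced here.

```c
#include <stddef.h>

/* Arithmetic mean of latency samples in milliseconds (0.0 for an empty set). */
static double mean_ms(const double *samples, size_t count)
{
    double sum = 0.0;
    size_t i;
    if (count == 0)
        return 0.0;
    for (i = 0; i < count; i++)
        sum += samples[i];
    return sum / (double)count;
}

/* Duration of one refresh interval in milliseconds. */
static double refresh_interval_ms(double refresh_hz)
{
    return 1000.0 / refresh_hz;
}

/* Express a latency (or a latency difference) in refresh intervals. */
static double lag_in_frames(double lag_ms, double refresh_hz)
{
    return lag_ms / refresh_interval_ms(refresh_hz);
}
```

At 60 Hz one refresh interval is about 16.67 ms, so a 15 ms difference works out to about 0.9 frames, which is consistent with one fewer buffered frame given the measurement granularity.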
Another piece of good news is that, from my limited testing, I found no bugs or issues while forcing the 2 swapchain images. Last time I tried this I remember having some trouble with RetroArch being frozen when I opened a game and only coming back to normal if I alt-tabbed out and then back in to RetroArch. But I don't seem to have that issue anymore. Maybe it was just a KWin bug.
I wish we could add a feature to force the number of swapchain images, even if just as a command-line parameter. But for now, until I find any issues, I'll use it patched to force 2 swapchain images.
@nfp0 Great! Thanks for these numbers and experiments! Some questions arise in my mind:
-Did you do these tests with "Threaded video" disabled? (Enabling it increases the output lag by a LOT!)
-Can you please do the same tests on the TTY? (=No wayland)
-Can you please do the same tests in OpenGL on Wayland? (OpenGL on Wayland won't let you choose the number of buffers, sadly).
No problem! :slightly_smiling_face:
-Did you do these tests with "Threaded video" disabled? (Enabling it increases the output lag by a LOT!)
Threaded video is disabled. I've never used it to be honest.
-Can you please do the same tests on the TTY? (=No wayland)
RetroArch already uses only 2 swapchain images on KMS on the TTY, as can be seen on the console output. I've tested it back when I posted my results on the forums and already achieved the lowest possible theoretical input lag (same as Windows exclusive fullscreen). Since it already uses 2 images, my patch doesn't change anything there.
Out of curiosity, has anyone ever claimed an input lag lower than 50ms on RetroArch on any system ever? (Without using frame delay and run-ahead, of course).
-Can you please do the same tests in OpenGL on Wayland? (OpenGL on Wayland won't let you choose the number of buffers, sadly).
I don't plan on using OpenGL, but I can give it a try yeah. I assume you want Hard GPU Sync on?
And do you want normal GL or GLCore?
Out of curiosity, has anyone ever claimed an input lag lower than 50ms on RetroArch on any system ever? (Without using frame delay and run-ahead, of course).
Not that I know. I don't use these, either. No need if having 2 buffers and NO threaded video.
I don't plan on using OpenGL, but I can give it a try yeah. I assume you want Hard GPU Sync on? And do you want normal GL or GLCore?
Can I have both, please? :)
Not that I know. I don't use these, either. No need if having 2 buffers and NO threaded video.
Well, there are always benefits in using them. Frame delay can shave off up to almost 16ms if your PC is fast enough.
Can I have both, please? :)
Wait, I've remembered now that my patch only affects Vulkan, as it's an edit of gfx/common/vulkan_common.c, so there will be absolutely no difference on OpenGL.
Do you still want me to test it?
@nfp0 Yes, please, I would like to see the lag you get for OpenGL on Wayland. I got around 65ms, which is a bit high, but then again it seems impossible to effectively control the number of buffers EGL uses (and blocking after eglSwapBuffers() shows no difference as I said). So I would like to see what you get.
Aight! I'll get back to you.
@vanfanel Sorry for the delay! I've been a bit busy, and these tests need to be done with daylight because of the slow-motion camera shutter speed and then I gotta count all the frames manually.
Ok, so, I got these numbers with the same methodology as before (10 samples):
gl: 60ms
glcore: 69ms
Keep in mind there's a 16ms margin of error, depending on when my finger presses the button in relation to the Vsync interval, so the 9ms difference between them might, or might not, be real. Regardless, it seems they're slower than 2 swapchain images from Vulkan, but about the same as 3 swapchain images.
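The margin-of-error reasoning can be written down as a one-line check. This is only a sketch with a made-up helper name; it treats the 16ms vsync interval as the uncertainty of a comparison, as described above.

```c
/* Returns 1 if two latencies differ by more than the stated margin, else 0. */
static int difference_exceeds_margin(double a_ms, double b_ms, double margin_ms)
{
    double diff = a_ms - b_ms;
    if (diff < 0.0)
        diff = -diff;
    return diff > margin_ms;
}
```

Applied to the gl (60ms) and glcore (69ms) figures with a 16ms margin, the check returns 0: the 9ms gap is smaller than one vsync interval, so it cannot be told apart from press-timing jitter.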
So long story short, let's use Vulkan if possible. :slightly_smiling_face:
EDIT: Mind you this was on RetroArch 1.11.1
@nfp0 Thanks, really, thanks. These results confirm my feelings: Vulkan + 2 swapchain images is the way to go, period. Too bad the EGL API doesn't provide a way to specify the number of buffers. But well, we have Vulkan, so it's OK.
No problem! Mind you this was on a vanilla build of RetroArch. I did not apply your eglSwapBuffers() change.
Yeah, Vulkan is the way to go for low latency. Now I just wish Mesa would hurry up, because for now, only a hacked RetroArch uses 2 swapchain images. Think we can convince libretro to have a way to force the swapchain? Even if just a command line option to keep it away from the uninformed user?
@nfp0 Well, making an option or commandline would be redundant I think: things work, we should simply get rid of asking Vulkan about the number of buffers. However, your patch did precisely that and it was rejected. So I don't know what to say. It's a pretty absurd situation if you ask me.
Because the bug is on Mesa's side. Forcing it on RetroArch works for now on our machines, but as @Themaister said, it's an invalid usage of Vulkan, which means it might not work on other setups or it might break at any update in the future. That's why I agree that it must not be available as a normal option, but I believe it would be very useful as a "force" option to work around the problem.
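To illustrate why forcing the count is out of spec: Vulkan requires the minImageCount passed at swapchain creation to be at least the minImageCount the surface reports (and at most maxImageCount, where 0 means no upper limit). The sketch below is hypothetical and is not RetroArch's code; the struct only mirrors those two VkSurfaceCapabilitiesKHR fields so that it compiles without the Vulkan headers, and the force flag stands in for the kind of workaround option discussed here.

```c
#include <stdint.h>

/* Hypothetical mirror of the two VkSurfaceCapabilitiesKHR fields used here. */
struct surface_caps {
    uint32_t min_image_count;
    uint32_t max_image_count;  /* 0 means "no upper limit" */
};

/* Pick the swapchain image count to request.
 * force == 0: clamp to what the surface reports (valid usage).
 * force != 0: pass the request through untouched (out of spec if it is
 *             below min_image_count; may break on other drivers). */
static uint32_t pick_image_count(uint32_t requested, struct surface_caps caps,
                                 int force)
{
    if (force)
        return requested;
    if (requested < caps.min_image_count)
        requested = caps.min_image_count;
    if (caps.max_image_count != 0 && requested > caps.max_image_count)
        requested = caps.max_image_count;
    return requested;
}
```

With a surface that reports a minimum of 3 (an assumed value for illustration), asking for 2 yields 3 on the valid path and 2 only when forced, which is the difference between a regular setting and a "force" workaround.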
We don't know how many years Mesa will take to solve this, and meanwhile distros are starting to default to Wayland so we'll have more and more users with subpar input lag on RetroArch.
We don't know how many years Mesa will take to solve this, and meanwhile distros are starting to default to Wayland so we'll have more and more users with subpar input lag on RetroArch.
Problem is, most users don't care. They never played the games on real hw so they don't know how they should feel/respond. So we are a bit out of luck.
That's true unfortunately. And to top it off, the default swapchain nr is 3.
But anyway, do you know who we should talk to, to ask if they're ok with having this setting as a workaround?
@nfp0 I don't know people in the RetroArch organization, so I don't know who we should talk to.
Our best bet is getting @Themaister to fix it on Mesa's side. Maybe he can read us?
I hope so. I'll see if I can raise some attention to this again on Discord.
@vanfanel Could this be what we were waiting for? Themaister has been working on this but I can't tell for sure if this is going to help with the swapchain issue or not. https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19279
@nfp0 Could be, yes! But I am not sure... I don't know Mesa so well under the hood. Let's wait and see what comes out of that. @Themaister is a genius: he invented the best gaming API ever and now he's on to fix the input lag problems in Vulkan on Wayland!
He sure is!
the best gaming API ever
Which one are you referring to?
Vulkan on Wayland
Actually, this problem affects both X and Wayland.
@nfp0 I was referring to the LibRetro API, which is TheMaister's creation.
LibRetro API
Oh wow! I had no idea that was also his creation. Amazing! :open_mouth:
@vanfanel Is it still an issue or can we close this issue? Thank you.
Vulkan works as expected by now, and I added a way to block until frame swapping is done in OpenGL, but I needed the Max Swapchain Images option to be displayed on Wayland, and that never happened.
@vanfanel I see that VK_KHR_present_wait has been implemented.
@vanfanel I see that VK_KHR_present_wait has been implemented.
Does RetroArch use that? I recall it was doing some workaround...
Does RetroArch use that? I recall it was doing some workaround...
I don't know.
Description
Currently, RetroArch will show noticeable input lag when using the OpenGL backend on the Wayland context (the max_swapchain setting is not available on Wayland + OpenGL).
Buffer swapping is done here for Wayland: https://github.com/libretro/RetroArch/blob/911308327dc1f06531575f9f606b21a0a25ac38a/gfx/drivers_context/wayland_ctx.c#L518 ...but it lacks a means to block until eglSwapBuffers() completes.
eglSwapBuffers() is by default a synchronous/blocking function: in theory it doesn't return until the requested buffer swap is done. This behavior can be changed with eglSwapInterval(0), which makes subsequent eglSwapBuffers() calls return immediately.
However, with the way things work internally on Wayland, a blocking eglSwapBuffers() returning means "you can send a new frame", not "the issued buffer swap is complete and the new contents are on screen". That's why waiting for the buffer swap event "manually" after eglSwapBuffers() can be a good idea.
In the KMS/DRM backend of SDL2, I use events to be notified on the buffer swap completion: https://github.com/libsdl-org/SDL/blob/5b2884cb0203cc63bf9753f8b55ea4c6c6f19cfb/src/video/kmsdrm/SDL_kmsdrmvideo.c#L391
So, any idea on what would be the equivalent in Wayland for explicitly blocking until the requested buffer swap is completed?
Expected behavior
RetroArch on Wayland using OpenGL should have less input lag.
Actual behavior
RetroArch on Wayland using OpenGL has noticeable input lag.
Steps to reproduce the bug
Bisect Results
Has always happened.
Version/Commit
Every version of RetroArch has this problem: as of today, there's no mechanism implemented to explicitly wait for vsync after the egl_swap_buffers() call here: https://github.com/libretro/RetroArch/blob/911308327dc1f06531575f9f606b21a0a25ac38a/gfx/drivers_context/wayland_ctx.c#L514
Environment information
@Themaister Can you please give me some input here? Are there Wayland-specific mechanisms to force a wait for completion after an eglSwapBuffers() call?