ad8e / vsync_blurbusters

Demo of tearing at a stable-ish point
BSD Zero Clause License

Add cross-platform VSYNC estimator based on Blur Busters open source #1

Open mdrejhon opened 1 year ago

mdrejhon commented 1 year ago

@ad8e - we just released this today:

https://github.com/blurbusters/RefreshRateCalculator

Although your algorithm is good, there are some things RefreshRateCalculator.js does in a more cross-platform way (using generic time-offsets between VSYNCs to avoid the need for platform-specific raster polls such as D3DKMTGetScanLine). Our algorithm could become the fallback default when the specific platform is not detected (e.g. Windows or Linux).

We released it under Apache 2.0, so you may use it in your other projects that require a vsync estimator. If you create a C++ version of RefreshRateCalculator.js, please send us a pull request to add additional platforms.

ad8e commented 1 year ago

If it works, go ahead. I also tried using presentation time to get vsync times, but it always produced an extra frame of latency - on my Windows 8 system, vsync on apparently means "wait a whole extra frame for no reason". So rendering starts at the beginning of the frame, you submit the frame after 0.2 frames, and then the system presents it 1.8 frames later.

Maybe it'll be more sensible on other systems, or on Windows 11.

OpenGL has severe issues with multithreaded rendering, so blocking rendering by waiting for frame presentation would be crazy for non-indie games. Vulkan is said to work properly with multi-threaded rendering.

You could also use GPU timer objects to try to get frame presentation times, though I didn't see in my brief exploration of them how to sync them with CPU times.



mdrejhon commented 1 year ago

It works in VSYNC OFF if one uses a separate thread for the timestamps. You can use VSYNC OFF and use this already-documented approach instead.

More platforms have a VSYNC listener than a raster poll. Even D3DKMTWaitForVerticalBlankEvent() has some timing jitter; the dejitterer reduces that jitter and ignores missed vsyncs (e.g. from computer freezes). Then you don't need to use VSYNC ON for the visible framebuffer. Cross-platform rasters are difficult without a module similar to this. This module is fairly multipurpose (e.g. emulators, for syncing emuHz to realHz) and overlaps with your purpose.
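A minimal sketch of the "dejitter and ignore missed vsyncs" idea, for readers who want to see it concretely. This is not the code in RefreshRateCalculator.js or vsync.cpp; the function name `estimate_period` and the `nominal_period` parameter are hypothetical, and a median is swapped in here as a simple stand-in for whatever filtering a real estimator uses.

```python
from statistics import median


def estimate_period(timestamps, nominal_period):
    """Estimate the refresh period (seconds) from noisy VSYNC timestamps.

    nominal_period is a rough guess (e.g. 1/60) used only to decide how
    many refresh cycles each gap spans.
    """
    per_cycle = []
    for earlier, later in zip(timestamps, timestamps[1:]):
        gap = later - earlier
        # A gap of about 2+ periods means one or more VSYNCs were missed.
        cycles = round(gap / nominal_period)
        if cycles >= 1:
            per_cycle.append(gap / cycles)
    if not per_cycle:
        raise ValueError("need at least two usable timestamps")
    # The median discards occasional jitter outliers.
    return median(per_cycle)
```

Dividing each gap by its cycle count means a dropped VSYNC still contributes one valid period sample instead of a double-length outlier.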

mdrejhon commented 1 year ago

Oh -- and something I discovered. Some platforms let you create two graphics contexts; e.g. an offscreen VSYNC ON frame buffer (sometimes it's sync'd to display Hz) while your visible frame buffer is VSYNC OFF. You feed the offscreen VSYNC ON to this module, which generates cross platform raster estimates for this VSYNC OFF.

This is not always reliable, but it's another hack/technique to have simultaneous VSYNC ON and VSYNC OFF in two processes / two windows / two threads. It depends on how the graphics framework decides to treat VSYNC ON on the offscreen buffer / offscreen window / etc.

No rush, I might just implement it myself -- but it opens up a lot of extra options, because more platforms have VSYNC/VBI detection or a listener (that can run concurrently with VSYNC OFF) than have a raster poll.

This is just a "math tool" module. Doesn't have external dependencies.
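To illustrate what a software raster estimate from VSYNC timing could look like, here is a minimal sketch. It is not taken from either project: the function name `estimate_scanline` is hypothetical, `total_lines` (visible lines plus blanking) is an assumed input, and the sketch assumes the VSYNC timestamp marks line 0.

```python
def estimate_scanline(now, last_vsync, period, total_lines):
    """Estimate the current raster line from VSYNC timing alone.

    now, last_vsync and period are in seconds. total_lines is the
    vertical total (visible lines plus blanking), an assumed input.
    Line 0 is taken to be the moment of the VSYNC timestamp.
    """
    # Fraction of the current refresh cycle that has elapsed (0.0 to 1.0).
    phase = ((now - last_vsync) % period) / period
    return int(phase * total_lines)
```

In practice the offset between the reported VSYNC time and the first visible scanline may differ per platform, so a real implementation would likely need a calibration offset.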

ad8e commented 1 year ago

In that case, the chain for D3DKMTWaitForVerticalBlankEvent() was already implemented. You can toggle it on by going to https://github.com/ad8e/vsync_blurbusters/blob/main/platform_vsync_windows.cpp and reading the comments at the top. The math side is in vsync.cpp, which outperforms the linear regression and filter methods. It does a clever thing with pivot points to handle the unique noise distribution of waiting.

I took a look at the filter in RefreshRateCalculator.js; an exponentially-weighted linear regression would do better. It's the same as Jongerius's old algorithm, or the one in validate(), just with farther back points decreased in weight. If you really want to use a filter, might as well use a delay line instead of 4 1-unit delays, or an IIR filter, although it won't matter much.
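A minimal sketch of what an exponentially-weighted linear regression over VSYNC timestamps could look like. It is an illustration only, not the code in vsync.cpp or RefreshRateCalculator.js; the function name `weighted_vsync_fit` and the `decay` parameter are hypothetical. It assumes each timestamp comes with an integer refresh-cycle index, so a missed vsync simply shows up as a skipped index.

```python
def weighted_vsync_fit(indices, timestamps, decay=0.9):
    """Fit timestamp ~= intercept + period * index by weighted least squares.

    Older samples are down-weighted by decay**age, so the newest sample
    has weight 1. Returns (period, intercept).
    """
    n = len(timestamps)
    if n < 2 or len(indices) != n:
        raise ValueError("need two or more (index, timestamp) pairs")
    weights = [decay ** (n - 1 - i) for i in range(n)]
    total = sum(weights)
    # Weighted means of the index and of the timestamp.
    mean_k = sum(w * k for w, k in zip(weights, indices)) / total
    mean_t = sum(w * t for w, t in zip(weights, timestamps)) / total
    # Weighted variance and covariance give the slope (the period).
    var_k = sum(w * (k - mean_k) ** 2 for w, k in zip(weights, indices))
    cov_kt = sum(w * (k - mean_k) * (t - mean_t)
                 for w, k, t in zip(weights, indices, timestamps))
    period = cov_kt / var_k
    return period, mean_t - period * mean_k
```

The predicted time of a future vsync `k` is then `intercept + period * k`. A `decay` closer to 1 averages over more history; a smaller value tracks drift faster at the cost of more noise.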

mdrejhon commented 1 year ago

It's not for the Windows version (yours is good) -- but for other platforms, if additional ones are implemented (e.g. Android, iOS, different Linux graphics frameworks, Mac, etc). There's a listener available for macOS.

The filtering most certainly can be improved, so commits are welcome. The RefreshRateCalculator is identical to Jongerius' old algorithm, and Jongerius' code was easier to port to a standalone JavaScript module for TestUFO needs. So this was the least-effort open source release.

That being said, it's likely your vsync.cpp algorithm is better -- it's just more work to port C++ to JavaScript than JavaScript to JavaScript. And since I needed it in 2017, it's a variant of existing code that I refactored in 2017.

I'll add a tracking issue to consider porting the vsync.cpp algorithm to JavaScript and benchmarking the two algorithms side by side (software-calculated raster estimates versus D3DKMTGetScanLine() ...) for error margins, to see which raster estimator is more accurate!

Would be fun (although I'm too busy right now to do that -- I just released a mostly unmodified module that's been closed-source for ages).

fuweichin commented 3 weeks ago

The vsync scheduler of the Chromium browser works perfectly for me, although it doesn't work as expected on the secondary monitor in a two-monitor, two-GPU setup.

A vsync scheduler can also be used to buffer HID input messages as coalesced pointer events / wheel events, not only to schedule the render loop.

Below listed are some related references:

mdrejhon commented 3 weeks ago

fuweichin commented Aug 16, 2024

Nice observations and links! Coalesced pointer events that are accurately grouped are an excellent use case. That doesn't provide subrefresh latency, however.

FYI -- for those reading -- VSYNC estimators like RefreshRateCalculator.js (and ad8e's project) don't have to rely on Chromium's VSYNC generator -- any VSYNC generator can be used as a feed, and the estimator will de-noise, de-jitter, and ignore (and smooth over) any dropped VSYNC events. Estimators like ad8e's and mine are designed to accept a noisy VSYNC generator and output precise dejittered VSYNC timestamps.

It's also good when a VSYNC estimator's output is accurate enough for basic beam racing purposes. For those familiar with raster interrupts and beam racing, check out www.testufo.com/raster51 ... It's, to my knowledge, the first-ever JavaScript-based beam racing demo, and can do subrefresh latencies with the mouse cursor.

When enabled in VSYNC OFF mode at www.testufo.com/raster51 -- look at how the JavaScript-rendered softcursor is ahead of the hardware mouse cursor! Especially at the bottom edge of the screen. Which is rather neat, software having less latency than the hardware mouse cursor.

This is not possible with coalesced pointer events (good in their own way, but not for latency). On some displays, the mouse cursor is only about 3 ms from mouse device to photons hitting your eyes, even if refresh cycles take 1/60 sec = 16.67 milliseconds, due to the way VSYNC OFF interrupts the current refresh cycle scanout (with tearlines at the raster splices). Not all display pixels refresh at the same time, so the bottom edge of a display has slightly more lag than the top edge of the display, and using VSYNC OFF mode bypasses this latency effect by letting new frames interrupt the display scanout (lowest lag is the first scanline below a tearing artifact).

A higher refresh rate lowers latency too, but bypassing the refresh rate scanout lag also reduces input lag to below a refresh cycle (as evidenced by the negative lag effect of a VSYNC OFF mouse movement with software graphics) -- you can see for yourself by relaunching Chrome on a Windows machine with the command line recommended at the page, for the specific purposes of the demo.
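The scanout arithmetic above can be sketched as follows. This is an illustrative model only: the function name `scanout_delay_ms` is hypothetical, it covers scanout delay alone (no input, rendering, or pixel-response time), and the vertical total of 1125 lines used in the usage example is an assumed typical value for a 1080p signal.

```python
def scanout_delay_ms(line, total_lines, refresh_hz, tear_line=0):
    """Milliseconds from a frame flip until `line` is scanned out.

    tear_line=0 models a flip at the start of the refresh cycle;
    a larger tear_line models a VSYNC OFF flip that happens while
    the raster is at that line. Lines above the tearline have to
    wait for the next refresh cycle.
    """
    period_ms = 1000.0 / refresh_hz
    # Number of lines the raster must still travel to reach `line`.
    lines_ahead = (line - tear_line) % total_lines
    return lines_ahead / total_lines * period_ms


# Bottom of a 1080p display at 60 Hz, flip at the start of the refresh:
bottom_from_top = scanout_delay_ms(1079, 1125, 60)
# Same line, but the flip lands while the raster is at line 1000:
bottom_from_tear = scanout_delay_ms(1079, 1125, 60, tear_line=1000)
```

Under these assumptions `bottom_from_top` is close to a full refresh period, while `bottom_from_tear` is only a small fraction of one, which is the effect described for scanlines just below a tearline.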

I have long wished that there was an optional "wait for compositing" disable API built into browsers, without needing to relaunch in --disable-gpu-vsync mode.

This "software mouse cursor having less lag than the operating system's hardware mouse cursor" effect is why I understand, from a scientific standpoint, why a lot of esports players tend to use VSYNC OFF mode for lower latency, despite the tearing artifacts. (That is a different use case from the demo link I provided, which is a more niche beam racing demo.)