inodentry opened this issue 2 years ago
There is a great Unity article about this in conjunction with pipelined rendering and their very long hunt to straighten it out.
Awesome, glad the community is taking this seriously. Trying to chase down these kinds of stutter issues at the application level can leave one feeling extremely haunted.
I thought I ran into this type of jitter at a visible level in a toy Bevy project where the debug build was running at opt-level 1. Measuring exactly what's going on with this sort of thing is currently a bit beyond my abilities, so I can't say for certain what I was seeing, but as an experiment I tried adding a system to PreUpdate that did a rolling average of delta times (inspired by this ancient post from the Our Machinery folks), and then using the resulting smoothed delta for all my movement simulation. That actually seemed to help a great deal, which is at least some evidence that inaccurate delta was the source of the stutter I was seeing.
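Roughly along these lines (a sketch of the idea rather than my exact code; it assumes the Bevy 0.7-era stage API, and the SmoothedDelta resource name and window size are made up for illustration):

```rust
use bevy::prelude::*;
use std::collections::VecDeque;

/// Made-up resource holding a rolling window of recent frame deltas.
#[derive(Default)]
struct SmoothedDelta {
    window: VecDeque<f32>,
    value: f32,
}

/// Runs in PreUpdate and averages the last few deltas; movement systems
/// then read `SmoothedDelta.value` instead of the raw `time.delta_seconds()`.
fn smooth_delta(time: Res<Time>, mut smoothed: ResMut<SmoothedDelta>) {
    const WINDOW: usize = 10; // arbitrary window size for this sketch
    smoothed.window.push_back(time.delta_seconds());
    if smoothed.window.len() > WINDOW {
        smoothed.window.pop_front();
    }
    let avg = smoothed.window.iter().sum::<f32>() / smoothed.window.len() as f32;
    smoothed.value = avg;
}

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .init_resource::<SmoothedDelta>()
        .add_system_to_stage(CoreStage::PreUpdate, smooth_delta)
        .run();
}
```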
Later, I ripped that out and just switched the debug profile to opt-level 2, which also smoothed out the worst of the stutter. So maybe the inaccuracy gets amplified when runtime performance gets worse, or maybe I was off base entirely; definitely not an area where I'm confident in my judgements yet.
I ran into this when implementing bevy_framepace. The solution I found was to track time in RenderStage::Cleanup. While not perfect, I found that this ended up giving me the best perceived results, as it is almost entirely dependent on when the frame is sent to the GPU, and there is very little scheduling variability to contend with. I agree that having presentation timings would be the ideal solution.
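For reference, the shape of it is roughly this (just a sketch of the idea, not the actual bevy_framepace code; it assumes the Bevy 0.7-era render sub-app API, and FrameInstant is a made-up name):

```rust
use bevy::prelude::*;
use bevy::render::{RenderApp, RenderStage};
use std::time::Instant;

// Made-up resource: the instant the previous frame finished the render schedule.
struct FrameInstant(Instant);

// Sample the clock at the very end of the render schedule, right after the
// frame has been handed off, instead of at the start of the next main-world
// update.
fn sample_frame_time(mut last: ResMut<FrameInstant>) {
    let now = Instant::now();
    let frame_delta = now - last.0;
    last.0 = now;
    // A frame limiter would pace against this delta; writing it back into the
    // main world's Time is the hard part discussed below.
    let _ = frame_delta;
}

fn main() {
    let mut app = App::new();
    app.add_plugins(DefaultPlugins);
    app.sub_app_mut(RenderApp)
        .insert_resource(FrameInstant(Instant::now()))
        .add_system_to_stage(RenderStage::Cleanup, sample_frame_time);
    app.run();
}
```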
This seems to be related to https://github.com/bevyengine/bevy/issues/4691. Since we're setting time in First, the timing gets delayed when there are more inputs to process or when winit has a weird delay. RenderStage::Cleanup won't work in this case because we need to access the main world to write the time. It seems like we either need to set the time after RenderStage::Cleanup, or somehow before winit events start getting processed. Once we have pipelined rendering, that could be in the Extract stage.
It doesn't feel very feasible to run things before winit events, but if we wanted to, we could fix things a bit in the current model by adding another schedule that runs after the sub-apps are run.
Long and detailed thread on Discord about this issue can be found here. I've done my best to make sure the discussion is reproduced here / in #4728, but if future readers ever want to really dig into the history, that's where it is.
I'm not sure if it's the same issue, but I get this level of jitter even in the simple "move_sprite" example.
Hardware: MacBook Pro 2020 (i5)
Long and detailed thread on Discord about this issue can be found here. I've done my best to make sure the discussion is reproduced here / in #4728, but if future readers ever want to really dig into the history, that's where it is.
What is the title of the thread? I use the Discord apps, and clicking the link doesn't open it on either mobile or desktop.
EDIT: "Timing, stutter, hitching, and jank" in #help.
Yes, the time jitter is not from the "complexity" of your project, but rather from the way Bevy updates its time values. The problem exists even in a minimal blank example app.
cargo run --example move_sprite --features wayland --release

I still see the sprite not moving smoothly.
Software: Fedora 36 with latest updates. Hardware: Ryzen 5 3500U, Vega 8 graphics, 12 GB 2400 MHz dual-channel memory.
Jitter still exists in the simplest 2D sprite-moving demo, even in fullscreen mode.
System: Windows, GPU: RTX 3060, CPU: i7-8700K
Likewise on my MacBook, my Linux machine, and when built for the web.
This issue snowballs badly due to the lack of a proper frame limiter, especially when something like FreeSync or G-Sync is in use.
With low CPU load, the framerate peaks above the monitor refresh rate, then drops lower, then goes back higher again, never settling at the actual refresh rate. This causes massive variation in delta time, which in turn results in major stutter.
To have a production-ready engine this has to be solved, combined with:
Right now it's really hard to achieve smooth gameplay on a decent machine. If this issue is not possible to solve right now, consider implementing something like a smoothed delta time. While it is incorrect and will make the simulation framerate-dependent, it will at least make the issue less visible to end users.
While experimenting with different settings, I stumbled upon Godot's suggestions for how to handle jitter/stutter, which are platform-dependent. E.g., see the Windows platform section.
https://docs.godotengine.org/en/stable/tutorials/rendering/jitter_stutter.html
So, I've tried setting WindowMode::BorderlessFullscreen in the WindowPlugin and compared that to Windowed / Fullscreen.
BorderlessFullscreen provides by far the best experience of the three on Windows 11.
There's also a weird input lag when using Windowed with a 60 Hz setup on a 144 Hz monitor. Not sure if it's related or not, but the same behaviour is not seen when just stress-loading the game to below 60 FPS.
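For anyone wanting to try the same comparison, this is roughly how the mode is set (a sketch; in Bevy 0.7/0.8 the WindowDescriptor is a plain resource, while newer versions configure it through WindowPlugin instead):

```rust
use bevy::prelude::*;
use bevy::window::WindowMode;

fn main() {
    App::new()
        // Insert the window settings before DefaultPlugins so the primary
        // window is created borderless-fullscreen from the start.
        .insert_resource(WindowDescriptor {
            mode: WindowMode::BorderlessFullscreen,
            ..default()
        })
        .add_plugins(DefaultPlugins)
        .run();
}
```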
Also, there's this suggestion, which kind of makes sense:
If your monitor supports it, consider enabling variable refresh rate (G-Sync/FreeSync) while leaving V-Sync enabled, then cap the framerate in the project settings to a slightly lower value than your monitor's maximum refresh rate as per this page. For example, on a 144 Hz monitor, you can set the project's framerate cap to 141. This may be counterintuitive at first, but capping the FPS below the maximum refresh rate range ensures that the OS never has to wait for vertical blanking to finish. This leads to similar input lag as V-Sync disabled with the same framerate cap (usually less than 1 ms greater), but without any tearing.
Bevy caps FPS higher than 144 on 144 Hz G-Sync / FreeSync monitors, which may lead to the framerate spiralling downwards and then back up again to something like 148. Not sure what causes this with PresentMode::AutoVsync, though.
Unfortunately, there's no built-in way to limit the FPS to a custom value without workaround hacks like loading the CPU with extra work, so there's no way to test this idea properly.
For example, on a 144 Hz monitor, you can set the project's framerate cap to 141. This may be counterintuitive at first, but capping the FPS below the maximum refresh rate range ensures that the OS never has to wait for vertical blanking to finish. This leads to similar input lag as V-Sync disabled with the same framerate cap (usually less than 1 ms greater), but without any tearing.
This is effectively what bevy_framepace does when the framerate limit is set to auto. It finds the display refresh rate and rounds down. This ensures that the CPU is never overproducing frames, and the queue length never exceeds 1. The downside of the approach is that it will occasionally starve the queue, because you are slightly underproducing frames, but this will manifest as a tear or a duplicated frame.

The plugin works using spin_sleep to get better-than-OS sleep accuracy without actually spinning and wasting CPU cycles; it can signal to the OS that the thread is in a spin loop, and it will use almost no power. This is really useful for both latency and power use reduction.
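To illustrate the pacing idea in isolation (a generic sketch of the technique, not the plugin's actual code; the 144 Hz value and the function names here are made up):

```rust
use std::time::{Duration, Instant};

// Pick a target frame time from the display refresh rate rounded down, then
// sleep off whatever is left of the frame. spin_sleep does an OS sleep for
// most of the wait and a short spin-loop-hinted wait for the remainder,
// giving sub-millisecond accuracy without burning a core.
fn pace_frame(frame_start: Instant, refresh_hz: f64) {
    let target = Duration::from_secs_f64(1.0 / refresh_hz.floor());
    let elapsed = frame_start.elapsed();
    if elapsed < target {
        spin_sleep::sleep(target - elapsed);
    }
}

fn main() {
    let refresh_hz = 144.0; // assumed; a real app would query the monitor
    loop {
        let frame_start = Instant::now();
        // ... simulate and render the frame here ...
        pace_frame(frame_start, refresh_hz);
    }
}
```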
Bevy version: 0.7 and current main.

Bevy's Time resource, which is the standard and recommended source of timing information in Bevy, that everything should use for anything timing-related, is inaccurate in typical game situations.

The issue occurs because Time is updated in a system at the beginning of the First stage in the main app schedule, using std time. This might be fine for a pure CPU simulation use case, where the app just runs as fast as possible on the CPU. However, it is inadequate for games.
The exact instant when Time is updated depends on when the time update system happens to be scheduled to run. It is not tied to actual rendering frame timings in any way.
For animation use cases (this includes most practical uses of delta time! moving objects, moving the camera, …), really anything where the timing information is needed to control what is to be displayed on the screen, the Time resource will not accurately represent the actual frame timings. This can lead to artifacts like jitter or hiccups. Usage of "fixed timestep" is also affected, as it derives its fixed updates from Time.

The issue can be observed if we enable vsync (which should lock Bevy to the display refresh rate and result in identical timings every frame) and print the value of time.delta() every frame update. The result is something like this:

You can see that these timings vary by at least a few hundred microseconds every frame, sometimes even as much as ~1 ms from frame to frame. This is, I guess, small enough that nobody has raised an issue in Bevy yet :D … but definitely large enough to risk causing real issues with animation/physics/etc.
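A minimal repro of that measurement looks something like this (a sketch using the Bevy 0.7-era API; vsync is the default present mode):

```rust
use bevy::prelude::*;

// Print the reported frame delta every update. With a perfectly accurate
// clock locked to a 60 Hz display, every line would read ~16.666 ms; in
// practice the values wander by hundreds of microseconds.
fn print_delta(time: Res<Time>) {
    println!("{:?}", time.delta());
}

fn main() {
    App::new()
        .add_plugins(DefaultPlugins)
        .add_system(print_delta)
        .run();
}
```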
The real solution to this problem would be to use frame presentation timings (time reported by the OS/driver corresponding to when the rendered frame is actually sent to the screen), which requires support from the underlying graphics APIs.
wgpu does not yet provide anything for this use case. This is understandable, as there is no standard API in Vulkan yet either. AFAIK, only Android, maybe recent versions of DirectX, and I think some Mesa extensions on Linux, provide such functionality.

Relevant work in Vulkan: https://github.com/KhronosGroup/Vulkan-Docs/pull/1364
In the meantime, we should explore ways to improve the accuracy of Time in any way we can. The First stage does not seem like the best place to do it.

Maybe a value that is much closer to the true frame timings (and likely good enough for most use cases) could be obtained from the Render schedule somehow? Maybe in the Prepare stage, when the swapchain texture is obtained (as this is where the "vsync wait" happens)?
Please discuss.
Related issue: #3768