bevyengine / bevy

A refreshingly simple data-driven game engine built in Rust
https://bevyengine.org
Apache License 2.0

Res<Time> is unreliable / jittery #4669

Open inodentry opened 2 years ago

inodentry commented 2 years ago

Bevy version: 0.7 and current main.


Bevy's Time resource, the standard and recommended source of timing information in Bevy (which everything timing-related should use), is inaccurate in typical game situations.

The issue occurs because Time is updated in a system at the beginning of the First stage of the main app schedule, using std time.

This might be fine for a pure CPU simulation use case, where the app just runs as fast as possible on the CPU. However, it is inadequate for games.

The exact instant when Time is updated depends on when the time update system happens to be scheduled to run. It is not tied to actual rendering frame timings in any way.

For animation use cases (which include most practical uses of delta time: moving objects, moving the camera, …), and really anything where timing information controls what is displayed on screen, the Time resource will not accurately represent the actual frame timings. This can lead to artifacts like jitter or hiccups.

Usage of "fixed timestep" is also affected, as it derives its fixed updates from Time.


The issue can be observed if we enable vsync (which should lock Bevy to the display refresh rate and result in identical timings every frame), and print the value of time.delta() every frame update. The result is something like this:

...
16.688ms
16.560ms
16.807ms
16.487ms
16.811ms
16.635ms
17.327ms
15.999ms
16.980ms
16.151ms
16.895ms
16.637ms
16.681ms
16.628ms
16.629ms
16.734ms
16.739ms
16.501ms
16.731ms
16.765ms
16.711ms
16.528ms
16.544ms
16.921ms
16.561ms
16.530ms
16.732ms
16.699ms
...

You can see that these timings vary by at least a few hundred microseconds every frame, sometimes by as much as ~1ms from frame to frame. This is, I guess, small enough that nobody has raised an issue in Bevy yet :D … but definitely large enough to risk causing real issues with animation/physics/etc.
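For readers who want to reproduce this kind of measurement outside of Bevy, here is a std-only Rust sketch of the technique: record `Instant::now()` each iteration and print the difference. The `sleep` is just a stand-in for the vsync wait (no Bevy APIs are involved); the spread in the measured deltas comes from scheduling noise, which is exactly the jitter described above.

```rust
use std::time::{Duration, Instant};

// Measure per-"frame" deltas the same way the log above was produced.
fn measure_deltas(frames: usize, frame_time: Duration) -> Vec<Duration> {
    let mut deltas = Vec::with_capacity(frames);
    let mut last = Instant::now();
    for _ in 0..frames {
        std::thread::sleep(frame_time); // stand-in for the vsync wait
        let now = Instant::now();
        deltas.push(now - last);
        last = now;
    }
    deltas
}

fn main() {
    let deltas = measure_deltas(30, Duration::from_millis(16));
    let min = deltas.iter().min().unwrap();
    let max = deltas.iter().max().unwrap();
    // The spread is typically a few hundred microseconds, even though every
    // iteration requests exactly the same sleep duration.
    println!("min: {min:?}, max: {max:?}, spread: {:?}", *max - *min);
}
```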


The real solution to this problem would be to use frame presentation timings (time reported by the OS/driver corresponding to when the rendered frame is actually sent to the screen), which requires support from the underlying graphics APIs. wgpu does not yet provide anything for this use case. This is understandable, as there is no standard API in Vulkan yet either. AFAIK, only Android, maybe recent versions of DirectX, and I think some Mesa extensions on Linux, provide such functionality.

Relevant work in Vulkan: https://github.com/KhronosGroup/Vulkan-Docs/pull/1364

In the meantime, we should explore ways to improve the accuracy of Time in any way we can. The First stage does not seem like the best place to do it.

Maybe a value that is much closer to the true frame timings (and likely good enough for most use cases) could be obtained from the Render schedule somehow? Maybe in the Prepare stage when the swapchain texture is obtained (as this is where the "vsync wait" happens)?

Please discuss.

Related issue: #3768

superdump commented 2 years ago

There is a great Unity article about this in conjunction with pipelined rendering and their very long hunt to straighten it out.

https://blog.unity.com/technology/fixing-time-deltatime-in-unity-2020-2-for-smoother-gameplay-what-did-it-take

nfagerlund commented 2 years ago

Awesome, glad the community is taking this seriously. Trying to chase down these kinds of stutter issues at the application level can leave one feeling extremely haunted.

I thought I ran into this type of jitter at a visible level in a toy Bevy project where the debug build was running at opt-level 1. Measuring exactly what's going on with this sort of thing is currently a bit beyond my abilities, so I can't say for certain what I was seeing, but as an experiment I tried adding a system to PreUpdate that did a rolling average of delta times (inspired by this ancient post from the Our Machinery folks), and then using the resulting smoothed delta for all my movement simulation. That actually seemed to help a great deal, which is at least some evidence that inaccurate delta was the source of the stutter I was seeing.
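A minimal std-only sketch of that rolling-average approach (the names here are illustrative, not a Bevy or Our Machinery API): keep the last N raw deltas and hand the simulation their average instead of the latest, noisy sample.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Hypothetical delta-time smoother: a fixed-size window of recent deltas.
struct SmoothedDelta {
    window: VecDeque<Duration>,
    capacity: usize,
}

impl SmoothedDelta {
    fn new(capacity: usize) -> Self {
        Self { window: VecDeque::with_capacity(capacity), capacity }
    }

    /// Push this frame's raw delta and return the rolling average.
    fn push(&mut self, raw: Duration) -> Duration {
        if self.window.len() == self.capacity {
            self.window.pop_front(); // drop the oldest sample
        }
        self.window.push_back(raw);
        let total: Duration = self.window.iter().sum();
        total / self.window.len() as u32
    }
}

fn main() {
    let mut smoother = SmoothedDelta::new(8);
    // Feed in a noisy delta; the smoothed value is what the movement code uses.
    let smoothed = smoother.push(Duration::from_micros(16_811));
    println!("smoothed delta: {smoothed:?}");
}
```

The trade-off, as noted later in the thread, is that a smoothed delta is technically incorrect: it decouples the simulation step from the real elapsed time, so it hides jitter rather than fixing it.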

Later, I ripped that out and just flipped to opt-level 2 for debug profile, which also smoothed out the worst of the stutter. So maybe the inaccuracy gets amplified when runtime performance gets worse, or maybe I was off-base entirely; definitely not a zone where I'm confident in my judgements yet.

aevyrie commented 2 years ago

I ran into this when implementing bevy_framepace. The solution I found was to track time in RenderStage::Cleanup. While not perfect, I found that this ended up giving me the best perceived results as it is almost entirely dependent on when the frame is sent to the GPU, and there is very little scheduling variability to contend with. I agree that having presentation timings would be the ideal solution.

hymm commented 2 years ago

This seems to be related to https://github.com/bevyengine/bevy/issues/4691. Since we're setting time in First, the timing gets delayed when there are more inputs to process or when winit has a weird delay.

RenderStage::Cleanup won't work in this case because we need to access the main world to write the time. Seems like we either need to set the time after RenderStage::Cleanup or somehow before winit events start getting processed. Once we have pipelined rendering that could be in the Extract stage.

It doesn't seem feasible to run things before winit events, but if we wanted to, we could improve things a bit in the current model by adding another schedule that runs after the sub-apps are run.

alice-i-cecile commented 2 years ago

Long and detailed thread on Discord about this issue can be found here. I've done my best to make sure the discussion is reproduced here / in #4728, but if future readers ever want to really dig into the history, that's where it is.

adsick commented 2 years ago
[attached image showing the observed jitter]

I'm not sure if that's the same, but I get this level of jitter even in the simple "move_sprite" example.

hardware: MacBook Pro 2020 (i5)

superdump commented 2 years ago

Long and detailed thread on Discord about this issue can be found here. I've done my best to make sure the discussion is reproduced here / in #4728, but if future readers ever want to really dig into the history, that's where it is.

What is the title of the thread? I use the Discord apps, and clicking the link doesn't open the thread on either mobile or desktop.

EDIT: "Timing, stutter, hitching, and jank" in #help.

inodentry commented 2 years ago

Yes, the time jitter is not from the "complexity" of your project, but rather the way Bevy updates its time values. The problem exists even on a minimal blank example app.

adsick commented 2 years ago
bevy/examples/2d on main [?] via ⚙️ v1.62.1 took 4m41s
75% ➜ cargo run --example move_sprite --features wayland --release

I still see the sprite not moving smoothly.

Software: Fedora 36 with latest updates. Hardware: Ryzen 5 3500U, Vega 8 graphics, 12 GB 2400 MHz dual-channel memory.

qpshot commented 1 year ago

Jitter still exists in the simplest 2D sprite-moving demo, even in fullscreen mode.

System: Windows, GPU: RTX 3060, CPU: i7-8700K

smpurkis commented 11 months ago

Likewise on my MacBook, my Linux machine, and when built for the web.

VergilUa commented 8 months ago

This issue snowballs badly due to the lack of a proper frame lock when using something like FreeSync or G-Sync.

With low CPU load, the framerate peaks above the monitor refresh rate, then dips lower, then climbs back up again, never settling at the actual refresh rate. This causes massive variation in delta time, which in turn results in major stutter.

To have a production-ready engine this has to be solved, combined with:

Right now it's really hard to achieve smooth gameplay on a decent machine. If this issue is not possible to solve right now, consider implementing something like a smoothed delta time. While that is incorrect and will make the simulation framerate-dependent, it will at least make the issue less visible to end users.

VergilUa commented 1 month ago

While experimenting with different settings, I stumbled upon Godot's suggestions for handling jitter/stutter, which are platform-dependent. E.g., see the Windows platform section.

https://docs.godotengine.org/en/stable/tutorials/rendering/jitter_stutter.html

So, I've tried setting WindowMode::BorderlessFullscreen for the WindowPlugin and compared that to Windowed / Fullscreen.

BorderlessFullscreen provides the best experience of them all on Windows 11.

There's also weird input lag when using a Windowed + 60 Hz setup on a 144 Hz monitor. Not sure if it's related or not, but the same behaviour is not seen when simply stress-loading the game to <60 FPS.

Also, there's this suggestion, which kind of makes sense:

If your monitor supports it, consider enabling variable refresh rate (G-Sync/FreeSync) while leaving V-Sync enabled, then cap the framerate in the project settings to a slightly lower value than your monitor's maximum refresh rate as per this page. For example, on a 144 Hz monitor, you can set the project's framerate cap to 141. This may be counterintuitive at first, but capping the FPS below the maximum refresh rate range ensures that the OS never has to wait for vertical blanking to finish. This leads to similar input lag as V-Sync disabled with the same framerate cap (usually less than 1 ms greater), but without any tearing.

Bevy caps FPS higher than 144 on 144 Hz G-Sync/FreeSync monitors, which may lead to an unstable framerate that spirals downwards and then back up again to something like 148. Not sure what causes this with PresentMode::AutoVsync though.

Unfortunately, there's no way to set a custom FPS limit without workaround hacks like loading the CPU with extra work, so there's no way to test this idea properly.

aevyrie commented 1 month ago

For example, on a 144 Hz monitor, you can set the project's framerate cap to 141. This may be counterintuitive at first, but capping the FPS below the maximum refresh rate range ensures that the OS never has to wait for vertical blanking to finish. This leads to similar input lag as V-Sync disabled with the same framerate cap (usually less than 1 ms greater), but without any tearing.

This is effectively what bevy_framepace does when the framerate limit is set to auto. It finds the display refresh rate and rounds down. This ensures that the CPU is never overproducing frames, and the queue length never exceeds 1. The downside of the approach is that it will occasionally starve the queue, because you are slightly underproducing frames; this will manifest as a tear or a duplicated frame.
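The "find the refresh rate and round down" behaviour can be sketched like this (a hypothetical helper for illustration, not bevy_framepace's actual code):

```rust
use std::time::Duration;

/// Derive a frame-time budget from a detected refresh rate by rounding the
/// rate down to a whole Hz, so the CPU slightly underproduces frames and the
/// present queue never grows beyond one frame.
fn auto_frame_budget(refresh_hz: f64) -> Duration {
    let capped_fps = refresh_hz.floor(); // e.g. 143.98 Hz -> 143 FPS cap
    Duration::from_secs_f64(1.0 / capped_fps)
}

fn main() {
    // A nominal "144 Hz" panel often reports something like 143.98 Hz.
    println!("budget at 143.98 Hz: {:?}", auto_frame_budget(143.98));
}
```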

The plugin works using spin_sleep to get better-than-OS sleep accuracy without actually spinning and wasting CPU cycles: it can signal that the thread is in a spin loop, so it uses almost no power. This is really useful for both latency and power-use reduction.
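A rough std-only sketch of that hybrid sleep-then-spin technique (the 2 ms slack value is an assumption standing in for OS-timer granularity; spin_sleep's real heuristics are more sophisticated):

```rust
use std::time::{Duration, Instant};

/// Sleep for most of the wait via the OS, then spin with a CPU relax hint for
/// the final stretch to hit the deadline with sub-millisecond accuracy.
fn precise_sleep(target: Duration) {
    let start = Instant::now();
    let slack = Duration::from_millis(2); // assumed OS scheduler granularity
    if target > slack {
        std::thread::sleep(target - slack); // cheap, coarse wait
    }
    while start.elapsed() < target {
        std::hint::spin_loop(); // hints to the processor that we're spin-waiting
    }
}

fn main() {
    let start = Instant::now();
    precise_sleep(Duration::from_millis(5));
    println!("slept for {:?}", start.elapsed());
}
```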