bevyengine / bevy

A refreshingly simple data-driven game engine built in Rust
https://bevyengine.org

Support interpolating values from fixed-timestep systems #1259

Open fenduru opened 3 years ago

fenduru commented 3 years ago

What problem does this solve or what need does it fill?

When the timesteps of a PhysicsSystem and a RenderSystem differ (e.g. because the PhysicsSystem runs on a fixed timestep), and the RenderSystem depends on the calculations done by the PhysicsSystem, it is desirable to interpolate the calculated values based on how far between physics steps each render frame actually falls, to avoid hitching. As it stands, a couple of issues prevent an obvious solution:

  1. FixedTimestep systems wait for the entire timestep to elapse before running the system. This means that if rendering runs at a higher framerate, the renderer will only be able to render the last physics-frame rather than being able to interpolate between the last physics frame and the next physics frame. For example, if we render at 200fps (5ms frame time) and do physics at 60fps (16ms frame time), then after a physics tick we'll render the same frame 3 times (because only 15ms has elapsed, we won't run the next physics tick until the 4th frame).
  2. The Transform component is a bit overloaded. While it is nice to have a unified component representing the transform of an entity, once we start trying to do interpolation there is a difference between the "physical transform" and the "interpolated transform". Rendering-related systems should likely use the interpolated transform, whereas other systems may or may not want to use the physical transform. We also must keep the previous and current physical transforms (in addition to the interpolated transform that gets rendered - unless the renderer computes the interpolation on the fly) so that when the physics system runs it does its calculations off the current position, and interpolation is always calculated between the same two values. Without this, the result of the physics simulation becomes dependent on the rendering timestep, as the same physics step might get interpolated differently. Synchronizing these two Transforms can get tricky (e.g. how does the PhysicsSystem tell the difference between a Transform that changed due to interpolation and one changed by a game system teleporting that object?)

What solution would you like?

  1. FixedTimestep systems should run once at t=0. In the above example, we would run one physics tick (putting the physics simulation 16ms into the future), and then for the next 3 frames we'll get overstep_percentage of 5/16, 10/16, and 15/16 (making interpolation really simple).

  2. Not sure what the best solution would be. It might be simplest for a physics plugin to copy Transform into a struct PhysicalTransform { current: Transform, previous: Transform } that it uses and updates during simulation, and then in a non-fixed-timestep system do the interpolation and write it out to Transform for rendering. It would have to initialize these by copying them in, and then also overwrite them when it detects a change to Transform from an earlier system (in the teleport example). This has the downside of game systems potentially reading Transform and acting on interpolated values that aren't actually canon according to the physics simulation - I think it is arguable whether this is a good thing or a bad thing.
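A minimal sketch of what that could look like. The `PhysicalTransform` component and both systems are illustrative, not an existing API, and it uses today's `Time<Fixed>::overstep_fraction` for the "overstep percentage":

```rust
use bevy::prelude::*;

/// Illustrative component: physics-owned transform state, kept separate
/// from the rendered `Transform`.
#[derive(Component)]
struct PhysicalTransform {
    current: Transform,
    previous: Transform,
}

/// Fixed timestep: remember the previous state, then advance the sim.
fn physics_step(mut query: Query<&mut PhysicalTransform>) {
    for mut physical in &mut query {
        let current = physical.current;
        physical.previous = current;
        // ... integrate velocities, resolve collisions, etc. ...
    }
}

/// Variable timestep: write the interpolated result into `Transform` so
/// rendering (and anything else reading `Transform`) sees smooth motion.
fn write_interpolated(
    mut query: Query<(&PhysicalTransform, &mut Transform)>,
    time: Res<Time<Fixed>>,
) {
    for (physical, mut transform) in &mut query {
        let t = time.overstep_fraction();
        transform.translation = physical
            .previous
            .translation
            .lerp(physical.current.translation, t);
        transform.rotation = physical
            .previous
            .rotation
            .slerp(physical.current.rotation, t);
    }
}
```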

What alternative(s) have you considered?

The idea of having future values could be built into the rendering system, which could handle the interpolation. For instance, the physics system could add/update a component Next<Transform>, with the Next type encapsulating the "percentage". Rendering systems would need to understand interpolation (and potentially have various/configurable interpolation strategies), query for something like Query<(&Transform, Option<&Next<Transform>>)>, and conditionally interpolate. If this becomes a common pattern then there is some opportunity for syntactic sugar, but that's getting ahead of ourselves.
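A rough sketch of that query pattern; `Next` here is a hypothetical component, and the system only computes the value rather than feeding a real renderer:

```rust
use bevy::prelude::*;

/// Hypothetical component from the paragraph above: the value a component
/// will have at the next fixed tick, plus how far the render clock has
/// progressed toward that tick.
#[derive(Component)]
struct Next<T: Component> {
    value: T,
    fraction: f32, // overstep percentage in 0.0..=1.0
}

/// Render-side system: interpolate on the fly when a future value is
/// available, fall back to the raw transform otherwise.
fn interpolated_translation(query: Query<(&Transform, Option<&Next<Transform>>)>) {
    for (transform, next) in &query {
        let translation = match next {
            Some(next) => transform
                .translation
                .lerp(next.value.translation, next.fraction),
            None => transform.translation,
        };
        // ... hand `translation` to the renderer ...
        let _ = translation;
    }
}
```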

This approach would ultimately encapsulate essentially the same information as above, but with the renderer being responsible for interpolation rather than a PhysicsPlugin. This seems like a better separation of concerns to me, as interpolation is done primarily to avoid graphical hitching - the physics simulation doesn't actually need it - but at the cost of introducing more opinions to core systems (not sure the maintainers' stance on if that's a pro or a con). Also it may just be unavoidable for the physics system to have to know about the interpolation since it is the one with an overstep_percentage that needs to be written somewhere.

Other notes

Not entirely related to interpolation, but some other pain points with fixed timesteps:

- Systems have to be aware of whether or not they're running in a fixed timestep, because they need to choose between Res<Time> and Res<FixedTimesteps>. This seems to go against the DI nature of Bevy: my integrate_velocities function doesn't care whether the timestep is fixed or not; it just needs to know the delta.

jpetkau commented 3 years ago

> Systems have to be aware of whether or not they're running in a fixed timestep, because they need to choose between Res<Time> and Res<FixedTimesteps>. This seems to go against the DI nature of Bevy: my integrate_velocities function doesn't care whether the timestep is fixed or not; it just needs to know the delta.

That can be handled by having integrate_velocities always use (say) the PhysicsTime (or SimulationTime or UpdateTime or whatever you want to call it), which might be updated at a fixed or variable rate. Likewise rendering would always use e.g. RenderTimestep. (Which allows for other subtleties, e.g. if you have an estimate of the true frame presentation time, the render time might be in the future, relative to the arrival time of UI events.)

IOW instead of Res<FixedTime> and Res<VariableTime>, there should be Res<RenderTime> and Res<UpdateTime>.

Or some variation on that theme, like a single Res<Time> with explicit time_of(LastUpdate) / time_of(NextUpdate) / time_of(LastFrame) / time_of(NextFrame) etc. calls. The point is that code shouldn't be written against the fixed vs. variable clock, but rather against the update vs. render clock.
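To make that concrete, a minimal sketch of such an API; none of these types exist in Bevy, and the names are purely illustrative:

```rust
use bevy::prelude::*;
use std::time::Duration;

/// Hypothetical: points on the update and render clocks that a system
/// can ask about, independent of whether either clock is fixed-rate.
enum ClockPoint {
    LastUpdate,
    NextUpdate,
    LastFrame,
    NextFrame,
}

#[derive(Resource)]
struct GameClock {
    last_update: Duration,
    update_step: Duration,
    last_frame: Duration,
    frame_step: Duration,
}

impl GameClock {
    fn time_of(&self, point: ClockPoint) -> Duration {
        match point {
            ClockPoint::LastUpdate => self.last_update,
            ClockPoint::NextUpdate => self.last_update + self.update_step,
            ClockPoint::LastFrame => self.last_frame,
            ClockPoint::NextFrame => self.last_frame + self.frame_step,
        }
    }
}

/// This system neither knows nor cares whether updates are fixed-rate.
fn integrate_velocities(clock: Res<GameClock>) {
    let _dt = clock
        .time_of(ClockPoint::NextUpdate)
        .saturating_sub(clock.time_of(ClockPoint::LastUpdate));
    // ... integrate using _dt ...
}
```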

wlott commented 3 years ago

This turned out to be quite a long post, so I'll summarize first:


I think the complications described here only just start to scratch the surface. Many components beyond Transform have the potential to need a render variant and a sim variant. Any system that needs to see/touch any of those components needs to correctly pick the right version to use. Bugs in that choice will only show up when someone tries to use that system in a setup where the two times have the potential to diverge noticeably.

I'm working on a side-project using Bevy with a friend (jpetkau of the prior comment, btw). My setup runs locked to vsyncs at 16.67ms/frame. He has a fancy variable-rate monitor or something and runs between 1 and 3ms/frame depending on CPU demands. I just don't see any of the bugs he is experiencing, so if I were to introduce a new one, I wouldn't experience it. Correctness would purely be a function of discipline--and relying on programmer discipline isn't really a successful strategy.

And even if you hypothesize bug-free code, there is a significant implementation challenge with having variable CPU work in your render loop. If some iterations through the main loop require a sim/physics run and some do not, the extra time to compute that physics run can cause noticeable stuttering in the frame rate. You can theoretically compensate by trying to interpolate to a predicted when-will-this-frame-actually-show-up time like Jeff alluded to above, but inaccuracies in that prediction could still show up as jerkiness in the animations.

Supreme Commander had exactly all these problems mid-way through its development. It ran the simulation at 10Hz and rendered at whatever the player's setup could run. And as soon as the time to do a simulation tick climbed past a fraction of the time to do a render, the game visually stuttered at the sim's 10Hz cadence. Render-render-render-hitch-render-render-render-hitch.

And the code was absolutely rife with bugs where people used values from the wrong time domain for the context in question. UI code would end up hit-testing against the physics location of a unit instead of the rendered location, making it difficult to select fast-moving units. Simulation code would randomly mix state from the UI (i.e. render time domain) with state from the sim. It was a mess.

The solution they (okay, I'll drop being coy--it was me) came up with was to completely split the sim/physics domain from the render domain into two completely separate data structures. In the parlance of Bevy, the best analogy would be two separate Worlds. The sim ran in its own thread completely decoupled from rendering. (This was early 2000s--multiple threads was cutting edge. I had bought me an AMD Athlon 64 x2 and I wanted to leverage that second core!) When the sim finished a tick, it would queue up the computed state needed by the renderer/UI. When the render thread interpolated past the "current" tick, it would apply the next batch of sim state. So there were actually three variants of game state: the live sim state, the end-of-tick snapshots, and the render/UI state. The snapshots let the sim start the next update before the render thread applied the previous.

This worked. We shipped the game with a deterministic sim/physics update (well, mostly deterministic as anyone who had a multiplayer game "desync" can attest). Upsides:

But it wasn't perfect:

Years later, I had the privilege to try again at a different company. Planetary Annihilation attempted to address some of those shortcomings by adding more magic and cleverness. Forrest Smith did a great job describing the whole system in this blog post. The short version in Bevy terms is that we built a distributed ECS where every "component" was backed by an animation key-frame history instead of a single current value. The simulation would grow history by appending to these curves. The render/UI stuff sampled the curves at a smoothly interpolating playback position. As a side-effect, this gave us replay for free: just move that interpolation point back to the beginning of time. And save/load: just store those history curves.

The original idea was that the sim and UI would both directly access this shared history database. In practice, we didn't end up doing exactly that. Instead, the sim kept a lot of state that wasn't part of the history/ECS system, and we only moved stuff into the history system that we needed for UI or replay/save/load purposes. And the UI/renderer ended up copying lots of state from the history system into traditional data structures instead of directly accessing the history system as I had originally envisioned. Also, I slipped that "distributed" word in there. We didn't have a headless variant of the game; headless was the only variant. The sim ran in its own process as part of a headless server even for single-player games. The client program contained the renderer and the UI and could spawn a background server process whenever needed. Which meant that the sim and UI were not really sharing the same database; they each had their own copy, and the networking code incrementally copied state from the server to the client. They were the same in the sense of "same implementation" but not actually physically the same bits in memory.

(Given that it was a RTS that we expected to be played primarily multiplayer, that seemed like a reasonable choice. Add single player to a multiplayer game by expanding "multi" to include the n=1 case instead of having multiplayer be an afterthought shoehorned into a singleplayer architecture.)

This worked quite well for our domain, but it still wasn't perfect. It was a significant engineering effort. It did solve some problems by establishing a decisive notion of time and how properties change over time. But the complexity was significant, and it is hard to see how to justify it unless your aspirations are as over-the-top as ours were for PA.

And it still had the same extra latency along the UI event -> simulation -> UI feedback path that Supreme Commander had. It was acceptable for PA because it was an RTS, but that could easily be a real problem for more latency sensitive genres like FPS or rhythm games.

Looking at Bevy specifically, I can't recommend the PA model despite my love of it. It is significant complexity that isn't needed and/or is incompatible with non-RTS style games. It also didn't extend beyond the domain of "game simulation state" in the same way Bevy is trying to leverage its ECS stuff (for example, using ECS for the front-end UI elements).

But I think the "multiple worlds" notion from Supreme Commander could have potential. I think a brute-force version of that could be implemented now--just instantiate a regular world for UI and a stripped-down headless world for the simulation. Each world would have an independent set of entities and systems and would have its own update loop. Data would have to be explicitly copied from one world to the other--probably via something that would look a lot like an in-memory save in the sim followed by a load-from-memory restore in the render/UI world.
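A rough sketch of that brute-force version; all names here are illustrative, and the copy step is reduced to a single component for brevity:

```rust
use bevy::prelude::*;

/// Illustrative: the stripped-down headless simulation world, stored as a
/// resource inside the render/UI world.
#[derive(Resource)]
struct SimWorld(World);

/// Links a render-world entity to its counterpart in the sim world.
#[derive(Component)]
struct SimEntity(Entity);

/// The explicit "copy" step: an exclusive system that pulls the state the
/// renderer needs out of the sim world, much like an in-memory save in the
/// sim followed by a restore into the render/UI world.
fn copy_from_sim(world: &mut World) {
    world.resource_scope(|world, sim: Mut<SimWorld>| {
        let mut targets = world.query::<(&SimEntity, &mut Transform)>();
        for (sim_entity, mut transform) in targets.iter_mut(world) {
            if let Some(sim_transform) = sim.0.get::<Transform>(sim_entity.0) {
                *transform = *sim_transform;
            }
        }
    });
}
```

Coordinating Entity ids between the two worlds, as suggested below, would remove the need for the SimEntity mapping.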

Fancier (i.e. less brute-force) would be to allow some kind of controlled sharing between the two worlds. Obvious low-hanging fruit would be things like having them both share the same asset loader or other helper "resources" that are meaningful to both worlds.

Another aspect that could benefit from sharing would be to make the two worlds use the same Entity ids. If issuing and recycling ids was coordinated between the two worlds, the copy data step wouldn't need to maintain a mapping from sim-side-id to/from ui-side-id.

The total cake-with-icing-and-sprinkles would be some kind of way to share components between the two worlds. Some components would only be relevant for simulation, some components only relevant for render/UI, but for components that are relevant to both having that component accessible from either would be highly convenient. And would basically be an automatic implementation of the "copy" step. Whether it was implemented with interlocking and tight coordination or with buffering is a design trade-space that would need some exploration.

Or another way of framing the exact same work would be to allow parallel execution of (mostly) independent schedules. Instead of calling it separate worlds with controlled sharing, call it one world with multiple scheduler threads. Same thing really, but one framing vs the other may work better with the existing Bevy conventions/expectations.

sojuz151 commented 3 years ago

I am not a Rust or gamedev expert, but I would suggest the following:

pnarimani commented 11 months ago

> Not sure what the best solution would be. It might be simplest for a physics plugin to copy Transform into a struct PhysicalTransform { current: Transform, previous: Transform } that it uses and updates during simulation, and then in a non-fixed-timestep system do the interpolation and write it out to Transform for rendering. It would have to initialize these by copying them in, and then also overwrite them when it detects a change to Transform from an earlier system (in the teleport example). This has the downside of game systems potentially reading Transform and acting on interpolated values that aren't actually canon according to the physics simulation - I think it is arguable whether this is a good thing or a bad thing.

I think this is the best solution to go with.

With any other solution, we're going to be over-engineering. Keep in mind that Bevy is supposed to be a general-purpose engine. We shouldn't build a huge, cumbersome solution that addresses 1% of games. I think Bevy is modular enough that those specific use cases can build their own solutions.

For an average game, having a PhysicsTransform that holds the true physical position of the entity is enough. Then the physics engine can interpolate (or not interpolate, and directly assign) Transform based on a setting that you specify on that specific Rigidbody.

Edit: To summarize:

jordanhalase commented 5 months ago

There are many ways to interpolate between transforms. The simplest would be linear interpolation, but what if the object in question is accelerating? Someone may want to interpolate using quadratics for such entities to avoid jagged linear motion. If the physics system is calculating a fast-moving circular path, they may want to use circular interpolation. What if it is spinning? Or if its angular velocity is changing? Bézier curves?

Perhaps linear interpolation alone with a fast enough timestep may be good enough to be unnoticeable, but not every type of game needs nor can afford a fast timestep.

For these reasons I am leaning more toward solution (2) in the original post. With this solution, I am surprised at how easy it can be in Bevy to support linear interpolation by simply reading the fixed-timestep clock from the regular Update schedule.

```rust
use bevy::prelude::*;

// Assumed component definitions (see the full gist):
// #[derive(Component)] struct Position(Vec2);
// #[derive(Component)] struct PositionOld(Vec2);

/// Perform linear interpolation from old position to new position (runs in Update)
fn interpolate_system(
    mut query: Query<(&PositionOld, &Position, &mut Transform)>,
    time: Res<Time<Fixed>>,
) {
    let (position_old, position, mut transform) = query.single_mut();

    let delta = position.0 - position_old.0;
    let lerped: Vec2 = position_old.0 + delta * time.overstep_fraction();

    transform.translation = lerped.extend(0.0);
}
```

Full gist (Feel free to use)
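For completeness, the fixed-timestep half of the pattern is just a snapshot taken each tick before physics moves the position. A minimal sketch, assuming hypothetical Position/PositionOld components like those in the gist:

```rust
use bevy::prelude::*;

#[derive(Component)]
struct Position(Vec2);

#[derive(Component)]
struct PositionOld(Vec2);

/// Runs first in FixedUpdate: snapshot the current position before the
/// physics systems move it, so `interpolate_system` in Update can blend
/// between the previous and current states.
fn store_old_position(mut query: Query<(&Position, &mut PositionOld)>) {
    for (position, mut position_old) in &mut query {
        position_old.0 = position.0;
    }
}
```

It would be registered to run in FixedUpdate before whatever system moves Position.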

There is one potential performance issue in that Time::<Fixed>::overstep_fraction() performs a division when called, and it may be called many times in the Update schedule; it could probably be optimized into a reciprocal multiply. (I may be nitpicking here.)

The real performance issue could be in the step copying position to position_old for every entity with these components. With more complicated physics data this can grow quite large. It would be interesting to benchmark a system that could somehow do this without copying state, instead simply keeping track of which is new and which is old, and swapping references to each behind the scenes. If anything, maybe Bevy could support this as a feature.

Nonetheless I think I am okay with this as a solution. It does not particularly feel like I am working "around" any supposed missing features in Bevy to do this, and in my mind it feels like the most "natural" way to do this; but I cannot say the same for the current physics engines for Bevy. Physics engines like Rapier should compute on transform structures that are not Transform, to separate the render transform from the physics transform, at least as an option or feature. Then the game developer can interpolate the values themselves (or have the physics engine or a third-party plugin do it). (Rapier seems to do this via TimestepMode::Interpolated and TransformInterpolation, but it lacks examples and I'm not fluent with it.)

There is also the issue of parenting, of course...

alice-i-cecile commented 4 months ago

I've laid out a plan to tackle this in #13773 :)

sojuz151 commented 4 months ago

I would like to point out that the position cannot be interpolated correctly based only on the initial and final positions, because velocity might not be continuous. For example, a ball bouncing or a missile hitting a shield. There, the physics system should return not just a single position but a list of positions with timestamps.

This is important, and jordanhalase's suggestion, for example, doesn't support it.

I would say that physics should return generic movement information, and custom interpolation should consume it.

Jondolf commented 1 month ago

Throwing this out here as one potential approach, specifically for general-purpose Transform interpolation that doesn't require anything from physics engines or practically any other libraries. Note that this is a work-in-progress, and I'm not suggesting we upstream this, at least in its current form.

A few days ago I made bevy_transform_interpolation, which interpolates changes made to Transform in the fixed timestep schedules. It essentially maintains start and end values for the interpolation of different Transform properties, and updates them in a way that allows a lot of flexibility and makes it possible to use Transform directly for (almost) everything.

Some pros:

Some current caveats:

The main thing I find nice about my approach is that it is just a drop-in solution that requires basically no changes from the user or 3rd party crates. It just works with Transform from what I've seen and heard from people who have tried it so far, although it's a bit early to say for sure.

Almost all proposals I have seen so far focus on the idea of explicitly separating "render Transform" and "GameplayTransform/PhysicsTransform", and while this is needed internally for the interpolation, I feel like it should be possible to avoid an API-level conceptual split. I don't remember having to use multiple different types of transforms in Unity or Godot, for example, although that approach probably would have some benefits as well.

> I would like to point out that the position cannot be interpolated correctly based only on the initial and final positions, because velocity might not be continuous. For example, a ball bouncing or a missile hitting a shield. There, the physics system should return not just a single position but a list of positions with timestamps.

This is kind of correct, although I'm not sure if I agree on the solution. A physics engine (impulse-based) generally only changes the positions of objects multiple times within a single timestep if the solver has substepping, which arguably most engines do have nowadays (like Rapier and Avian). However, simplifying a few details, this is kind of equivalent to just subdividing the timestep into multiple smaller timesteps, which is similar to just increasing the fixed timestep tick rate (again, glossing over details). It still doesn't make the interpolation truly accurate, especially if you use a small number of substeps.

I don't feel like returning a list of positions would be particularly useful or efficient for most applications, especially if they already have a high enough tick rate where the extra interpolation steps wouldn't have visible benefit. I would be curious if you know existing engines that use this approach, though.

Something that might be useful is using the velocity data from the previous and current tick, and using something like Hermite interpolation to more accurately approximate the motion of the object along that curve. I'm not entirely sure how well it would deal with sudden accelerations or decelerations that change the object trajectory drastically, but I think it should help in most cases. That's something I'll have to try out.
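A minimal sketch of that idea (illustrative, not from any plugin): cubic Hermite interpolation over one fixed timestep, using positions and velocities from the previous and current ticks as endpoints and tangents:

```rust
use bevy::math::Vec2;

/// Cubic Hermite interpolation between two physics states.
/// `p0`/`v0` are position and velocity at the previous fixed tick,
/// `p1`/`v1` at the current tick, `dt` is the fixed timestep in seconds,
/// and `t` is the overstep fraction in `[0, 1]`.
fn hermite(p0: Vec2, v0: Vec2, p1: Vec2, v1: Vec2, dt: f32, t: f32) -> Vec2 {
    let (t2, t3) = (t * t, t * t * t);
    // Standard cubic Hermite basis functions.
    let h00 = 2.0 * t3 - 3.0 * t2 + 1.0;
    let h10 = t3 - 2.0 * t2 + t;
    let h01 = -2.0 * t3 + 3.0 * t2;
    let h11 = t3 - t2;
    // Scale velocities by the timestep so the tangents are in position
    // units per unit of interpolation parameter.
    h00 * p0 + h10 * (v0 * dt) + h01 * p1 + h11 * (v1 * dt)
}
```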

Edit: Realized that angular velocity also needs to be taken into account at high rotation speeds to avoid artifacts. There are two (or more in 3D) ways to reach the same rotation, and the "naive" slerp just takes the shortest path.

tbillington commented 1 month ago

For prior art, Godot just released 2D physics interpolation and is working on 3D.

janhohenheim commented 4 weeks ago

Also adding my own take to the pile of potential solutions. See its readme for differences to @Jondolf's plugin. I believe the scheduling used in my plugin is closer to what I'd like to see upstream, as it mimics the approach taken in the official fixed timestep example. Disclaimer: obviously I'm gonna say that; I wrote that example and added the RunFixedMainLoopSystem convenience API to Bevy, so I'm hugely biased :D I arrived at this approach through a lot of discussion with others, though, so I believe it somewhat represents a current consensus among some Bevy users.

Regarding the question of whether we need to avoid a split between "render transform" and "gameplay transform", I think an upstreamed interpolation would allow a user to mostly disregard the render transform and just work with the gameplay transform. Maybe it would even make sense to leave Transform as the gameplay transform so user code doesn't break and just introduce a new RenderTransform component?