dimforge / bevy_rapier

Official Rapier plugin for the Bevy game engine.
https://rapier.rs
Apache License 2.0
1.22k stars 259 forks

Bug: Object never sleeping #508

Closed Vrixyz closed 3 months ago

Vrixyz commented 3 months ago

From a report on discord: https://discord.com/channels/691052431525675048/1239318969689837588/1239970243158216745

The exact reproduction steps are unknown; please comment if you can reproduce it.

What I tried so far

More info which could be relevant:

From comment:

BaronVonScrub commented 3 months ago

Hey there, Discord reporter here.

I took my own project and cut it back to as minimal a reproduction as possible - which admittedly is still pretty bulky.

The transforms are being generated by the (seeded) RNG in my deserialization procgen. I tried just copying the problem transforms and instantiating directly, bypassing the system to recreate the problem scenario, but it didn't work.

So either I didn't have enough precision in the Inspector to recreate it (most likely), or it's emergent from the physics interactions in this specific setup.

Either way, I've just left the procgen in there to initialize the scene to a scenario I KNOW has these glitch cases. You get to see my nooby, uncommented code. Fun fun.

In case it wasn't apparent what the physics was for, I was making use of the contact forces to autocorrect any overlapping procgen parts to a valid position.

I've also got it printing out the transforms to the terminal each step, so the precisional flip-flopping of the transforms is easily apparent.

https://www.github.com/BaronVonScrub/Rapier-Bug-Swatting

The Rapier-relevant stuff is added in structure.rs at line 973

As for your questions: 1) Windows 10 Pro 2) 3D 3) There is no ground collider, I just have them translationally locked in the Y dimension (and all dimensions rotationally). 4) All colliders in the reproduction are CapsuleY. 5) The colliders are all on top-level entities. No parenting is involved.

As noted in the linked report, within my local fork, switching the last-frame/this-frame GlobalTransform equality check to an epsilon-supported comparison solved one identified issue, and the other was circumvented for my personal use-case by commenting out the line where the interpolated transform was inserted into the storage hashmap.
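For readers unfamiliar with the idea, here is a minimal sketch of what an epsilon-supported comparison looks like next to an exact one. It is not the code from the fork: the `Quat` alias, the function names, and the `1e-6` tolerance are hypothetical, and plain arrays stand in for the real transform types. The sample values are rotations of the kind printed in the logs elsewhere in this thread.

```rust
/// Rotation as [x, y, z, w]; a stand-in for the real quaternion type.
type Quat = [f32; 4];

/// Exact comparison: any rounding difference counts as "the transform changed".
fn exactly_equal(a: Quat, b: Quat) -> bool {
    a == b
}

/// Tolerant comparison: each component may differ by up to `epsilon`.
fn approx_equal(a: Quat, b: Quat, epsilon: f32) -> bool {
    a.iter().zip(b.iter()).all(|(x, y)| (x - y).abs() <= epsilon)
}

fn main() {
    // Two encodings of the same rotation that differ only by rounding.
    let before: Quat = [0.0, -0.33848378, 0.0, 0.94097227];
    let after: Quat = [0.0, -0.3384838, 0.0, 0.94097227];

    // Hypothetical tolerance, chosen only for illustration.
    let epsilon = 1e-6;

    println!("exact comparison:   {}", exactly_equal(before, after));
    println!("epsilon comparison: {}", approx_equal(before, after, epsilon));
}
```

With an exact check, a difference in the last digit is enough to flag the transform as modified on every frame, which would keep a body from ever settling. A tolerance absorbs that noise, at the cost of having to pick a threshold.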

However, I've since found that with the main library, the issue is also circumvented with an overwrite of the config timestep_mode. I found that TimestepMode::Variable { 10.0, 10.0, 10 } works, but dropping any of these back to 1 is sufficient to recreate the issue.

Between the interpolation clue, the timestep clue, and the precision flip-flopping clue, I suspect that there is some conversion error going on within the interpolation logic, perhaps in the isometry conversion?

Hope this helps!

Vrixyz commented 3 months ago

Thanks for the thorough investigation, I'm confirming the bug! I was able to make a very minimal reproduction case here: https://github.com/Vrixyz/Rapier-Bug-Swatting/blob/main/examples/minimal.rs, which should help with tracking this down.

To understand what's going on, it's important to focus on the first frames after initialization:

On startup, Bevy does a first pass of transform propagation to initialize GlobalTransform. We can see there that we already lose precision:

[local] Body  rotation: [0, -0.33848378, 0, 0.94097227]
[Global] Body rotation: [0, 0, 0, 1]
propagating
[local] Body  rotation: [0, -0.33848378, 0, 0.94097227]
[Global] Body rotation: [0, -0.3384838, 0, 0.94097227]
logs done
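One plausible source of this kind of loss, offered as an assumption rather than a confirmed diagnosis, is a round trip between a quaternion and a matrix representation (Bevy's GlobalTransform stores an affine matrix rather than a separate rotation quaternion). The sketch below uses plain arrays and textbook conversion formulas, not glam's or nalgebra's actual implementations, so its rounding behaviour may differ from the real code. The names `quat_to_mat3`, `mat3_to_quat`, and `max_abs_diff` are hypothetical.

```rust
/// Rotation as [x, y, z, w], matching the order in the logs above.
type Quat = [f32; 4];
/// Row-major 3x3 rotation matrix.
type Mat3 = [[f32; 3]; 3];

/// Standard unit-quaternion to rotation-matrix conversion.
fn quat_to_mat3(q: Quat) -> Mat3 {
    let [x, y, z, w] = q;
    [
        [1.0 - 2.0 * (y * y + z * z), 2.0 * (x * y - z * w), 2.0 * (x * z + y * w)],
        [2.0 * (x * y + z * w), 1.0 - 2.0 * (x * x + z * z), 2.0 * (y * z - x * w)],
        [2.0 * (x * z - y * w), 2.0 * (y * z + x * w), 1.0 - 2.0 * (x * x + y * y)],
    ]
}

/// Rotation-matrix to quaternion conversion.
/// Only the branch for a positive trace is handled, which is enough here.
fn mat3_to_quat(m: Mat3) -> Quat {
    let trace = m[0][0] + m[1][1] + m[2][2];
    let s = (trace + 1.0).sqrt() * 2.0; // s = 4 * w
    [
        (m[2][1] - m[1][2]) / s,
        (m[0][2] - m[2][0]) / s,
        (m[1][0] - m[0][1]) / s,
        0.25 * s,
    ]
}

/// Largest per-component absolute difference between two quaternions.
fn max_abs_diff(a: Quat, b: Quat) -> f32 {
    a.iter().zip(b.iter()).map(|(x, y)| (x - y).abs()).fold(0.0, f32::max)
}

fn main() {
    let local: Quat = [0.0, -0.33848378, 0.0, 0.94097227];
    let round_tripped = mat3_to_quat(quat_to_mat3(local));

    println!("original:      {:?}", local);
    println!("round-tripped: {:?}", round_tripped);
    println!("bit-identical: {}", local == round_tripped);
    println!("max abs diff:  {:e}", max_abs_diff(local, round_tripped));
}
```

Whether the round trip is bit-identical depends on the input and on the exact formulas used. The point is only that every conversion involves several f32 roundings, so two libraries converting the same rotation need not agree in the last digit.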

Then, when we reach Update, we are still in the same state:

[local] Body  rotation: [0, -0.33848378, 0, 0.94097227]
[Global] Body rotation: [0, -0.3384838, 0, 0.94097227]
rigidbody rotation from rapier: [0.0, -0.33848378, 0.0, 0.9409722]
[C:\Users\thier\Documents\pro\clients\foresight\projects\bevy_rapier\bevy_rapier3d\../src\plugin\systems.rs:659:54] interpolated_pos.rotation = Quat(
    0.0,
    -0.33848378,
    0.0,
    0.9409722,
)

I added a log of the rotation received from our RapierContext. What's interesting is that Rapier has a different value, with a different precision loss. I think that explains the "flip flopping".
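To put a number on how small these differences are, the sketch below measures the gap between the logged values in ULPs (units in the last place), that is, how many representable f32 steps apart they sit. The helper `ulp_distance` is hypothetical and assumes both inputs are finite and share the same sign.

```rust
/// Number of representable f32 values between `a` and `b`.
/// Assumes both are finite and have the same sign.
fn ulp_distance(a: f32, b: f32) -> u32 {
    a.to_bits().abs_diff(b.to_bits())
}

fn main() {
    // y component: local Transform vs. propagated GlobalTransform.
    let local_y: f32 = -0.33848378;
    let global_y: f32 = -0.3384838;

    // w component: Bevy side vs. the value read back from Rapier.
    let bevy_w: f32 = 0.94097227;
    let rapier_w: f32 = 0.9409722;

    println!("y differs by {} ULP", ulp_distance(local_y, global_y));
    println!("w differs by {} ULP", ulp_distance(bevy_w, rapier_w));
}
```

Differences this small are invisible at inspector precision, which would be consistent with the earlier observation that copying the transforms by hand did not reproduce the problem.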

What's left to do now is to reduce the reproduction further, down to the specific types involved. I suspect glam and nalgebra encode matrices slightly differently, but I'm not sure yet. Stay tuned.

HumanBeanGames commented 3 months ago

Can confirm, #510 appears to fix it!