bjornbytes / lovr

Lua Virtual Reality Framework
https://lovr.org
MIT License
1.99k stars 138 forks source link

ODE crashes when colliders are millions of meters away #498

Closed ottworks closed 2 years ago

ottworks commented 3 years ago

This happens pretty much at random for me currently, but looking at the error line I think a collider is flying way off out of bounds and crashing the game. Perhaps delete colliders that go out of bounds instead? Or constrain the whole simulation.

bjornbytes commented 3 years ago

It seems like it would be kinda expensive to do an out of bounds test for all the colliders on every update. Hmm

ottworks commented 3 years ago

It seems like it would be kinda expensive to do an out of bounds test for all the colliders on every update. Hmm

Can it be deleted in the same place this error is thrown?

bjornbytes commented 3 years ago

ODE doesn't provide a way to catch errors like this as far as I can tell. World:update / World:computeOverlaps / World:raycast would need to loop over all shapes of all bodies and disable any shapes that have a position in the billions. We couldn't stagger the work (like check 1 body per update or something) because then the error could still happen. It may just be better to do the check in Lua for the bodies that you know might have this issue, which isn't really a great solution but avoids adding this overhead to every physics world.

jmiskovic commented 3 years ago

I don't understand the circumstances under which this happens. I tried reproducing it, by putting few colliders really far out and by setting up strong gravity so they quickly become as distant as possible. Nothing triggered it.

What happens when this assertion fails? Does the sim continue or does the lovr crash?

ottworks commented 3 years ago

Lovr crashes. My colliders were applied with huge forces.

jmiskovic commented 3 years ago

I reproduced it, comes down to having at least two colliders and having coordinates with value outside the integer range.

world = lovr.physics.newWorld()
c1 = world:newBoxCollider(1e8, 0, 0)
c2 = world:newBoxCollider(1e8, 0, 0)
world:update(1)

I really don't like when VR app crashes, it should degrade gracefully if there are errors. ODE on the other hand has other ideas and doesn't refrain from segfaulting if it detects anomalies in simulation.

I already tried to patch this before and it seems I missed the crash-on-purpose *(int *)0 = 0; inside dCheck macro definition. If this is removed, the physics will not crash the app and in this case the only consequence is that collision detection won't work properly for colliders while they are out of bounds. Some other cases might be more serious but I don't think I ever had dCheck crash on me so far.

Modifying ODE is a bit of maintenance problem as it makes it harder to merge in their upstream changes, but in this case it makes more sense than other options.

Deleting collider without user's involvement would be wrong I think. For example, if some colliders are used as part of special effects, maybe they will drift off out of bounds but user will expect to re-use them later for another explosion or other effect.

What do you think?

ottworks commented 3 years ago

Solid reasoning. Do they behave again when brought back in bounds?

jmiskovic commented 3 years ago

Yes. Even out of bounds they are correctly influenced by gravity and other forces, just collision detection misbehaves because AABB bounds use integers and the conversion results in undefined behavior. Once brought back into valid bounds the collider resumes colliding happily. With this change the out-of-bounds would no longer crash, but you would still get the assertion message in lovr's log output.

bjornbytes commented 3 years ago

Aside from the threading pool, dICHECK is only used in 4 places.

Most asserts in ODE are guarded by dNODEBUG but dICHECK isn't.

Maybe we should revert the change on the lovr fork that removed the abort/exit, protect dICHECK by dNODEBUG, and start applying the dNODEBUG define in release builds. This way debug builds of lovr will get the asserts and crashes, but release builds won't have those checks at all.

If people feel the debug messages are still useful even in release builds, then we could remove the *(int*)0 = 0 as suggested. However it may still be a good idea to apply dNODEBUG in unchecked/supercharge builds.

bjornbytes commented 2 years ago

Actually, dICHECK was specifically added to be enabled in release builds for errors that must be handled. So it is probably best to keep its behavior.

Instead, I made a change to ODE to ignore colliders that are out of bounds instead of crashing. Tested on several normal physics examples as well as the repro above (and added that to the shiny new tests folder I am experimenting with!) and it looks good.