dimforge / parry

2D and 3D collision-detection library for Rust.
https://parry.rs
Apache License 2.0
567 stars 100 forks source link

Potential miscompile on Wasm / Chrome when Simd128 is enabled #74

Closed kettle11 closed 2 years ago

kettle11 commented 2 years ago

This is likely not a bug in Rapier but I wanted to open this issue for visibility in case anyone ever hits this.

I believe there's a miscompilation of some sort of likely happening in Chrome's WebAssembly backend that causes errors in Parry's BVH generation only when Simd128 is used.

Specifically this call to merge within qbvh.rs produces incorrect results: https://github.com/dimforge/parry/blob/167e0948d5f47107390e6d9c84f67095b6b42c3e/src/partitioning/qbvh.rs#L274

What I was seeing is this AABB:

AABB { mins: OPoint { coords: Matrix { data: [[0.0, -50.0, 0.0]] } }, maxs: OPoint { coords: Matrix { data: [[199.90764, 175.0, 197.43108]] } } } when merged with this AABB: AABB { mins: OPoint { coords: Matrix { data: [[-2824.0708, 46.78899, 948.4769]] } }, maxs: OPoint { coords: Matrix { data: [[-2822.8105, 47.78899, 949.73706]] } } }

would produce this faulty AABB: AABB { mins: OPoint { coords: Matrix { data: [[-2824.0708, -50.0, 0.0]] } }, maxs: OPoint { coords: Matrix { data: [[0.0, 175.0, 949.73706]] } } }

If I insert log statements that log to the browser console nearby then the code works correctly. Also if the browser console is opened then this code would work correctly until the page is refreshed.

This issue does not reproduce if I replace the call to my_aabb.merge(&aabbs[*id]); with this:

use crate::na::SimdPartialOrd;
let other = aabbs[*id];
my_aabb.mins = my_aabb.mins.coords.zip_map(&other.mins.coords, |a, b| {
    a.simd_min(b)
}).into();
my_aabb.maxs = my_aabb.maxs.coords.zip_map(&other.maxs.coords, |a, b| {
    a.simd_max(b)
}).into();

But does reproduce if I replace it with this:

 my_aabb.mins =  my_aabb.mins.coords.inf(&other.mins.coords).into();
 my_aabb.maxs =  my_aabb.maxs.coords.sup(&other.maxs.coords).into();

Based on how this occurs I'd assume this is a bug in Chrome's WebAssembly backend that occurs only for certain permutations of generated code. Perhaps when the console is opened Chrome runs more conservatievly to facilitate debugging so it does not trigger this behavior?

I may end the investigation here as I've already spent over a day on this, but given how absurdly hard to diagnose this is I wanted to open an issue in case anyone else ever sees something similar.


Potentially relevant: I've only tested this on an M1 Macbook Air.

kettle11 commented 2 years ago

I have created a repository with a minimal reproducable example: https://github.com/kettle11/min_repro_chrome_simd_bug/tree/main

kettle11 commented 2 years ago

I opened a Chrome bug: https://bugs.chromium.org/p/chromium/issues/detail?id=1313647

sebcrozet commented 2 years ago

Thank you for investigating this! It is interesting to see that the issue happens even if you haven’t enabled any of the simd-related features (simd-stable or simd-nightly) of parry. So, could the auto-vectorisation performed by the Rust compiler be the one that messes up?

What happens if you try with the feature simd-nightly enabled (in which cases the SIMD instructions generated should depend on the what the packed_simd crate generates)?

kettle11 commented 2 years ago

What happens if you try with the feature simd-nightly enabled (in which cases the SIMD instructions generated should depend on the what the packed_simd crate generates)?

Good idea! I just checked and it still happens even if simd_nightly is enabled.

However in the process of testing that I noticed the issue only occurs after the first page refresh (if the dev console is closed). That was not the case in my larger project. Another weird detail!

kettle11 commented 2 years ago

Fixed in an upcoming Chrome release!

https://bugs.chromium.org/p/chromium/issues/detail?id=1313647#c22