mattdesl / mp4-wasm-encoder

https://mattdesl.github.io/mp4-wasm-encoder/
MIT License

Compiling code with NEON intrinsics to Wasm SIMD #2

Open ngzhian opened 3 years ago

ngzhian commented 3 years ago

Ensure that WASM version of minih264 library is indeed taking advantage of SIMD (lots of NEON code that doesn't compile there)

Really cool work :)

I'm wondering if you've tried the approach described at https://emscripten.org/docs/porting/simd.html#compiling-simd-code-targeting-arm-neon-instruction-set. @seanptmaher did some work on SIMDe + Emscripten (cc @tlively), which transparently supports a subset of ARM NEON intrinsics. This might get you past some of the compilation errors, and could help with performance too.
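A minimal sketch of what that page describes, as far as I read it: include <arm_neon.h> and build with Emscripten's SIMD and NEON flags. The file name add.c, the function add_u8, and the USE_NEON macro are made up for this example, and it assumes the basic load/add/store intrinsics are among the supported subset. Without USE_NEON the same file falls back to a plain scalar loop, so it also builds with an ordinary C compiler.

```c
/* Build sketch (flags as I understand the linked Emscripten page; check there):
 *   emcc -O2 -msimd128 -mfpu=neon -DUSE_NEON add.c -o add.js
 * USE_NEON is a hypothetical macro used only by this example. */
#include <stddef.h>
#include <stdint.h>

#ifdef USE_NEON
#include <arm_neon.h>
#endif

/* Adds two byte buffers with wraparound, 16 lanes at a time when NEON is on. */
void add_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t i = 0;
#ifdef USE_NEON
    for (; i + 16 <= n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);
        uint8x16_t vb = vld1q_u8(b + i);
        vst1q_u8(dst + i, vaddq_u8(va, vb));
    }
#endif
    /* Scalar tail, or the whole buffer when NEON is off. */
    for (; i < n; i++) {
        dst[i] = (uint8_t)(a[i] + b[i]);
    }
}
```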

Also, I work on WebAssembly SIMD in V8. If you have questions or bug reports, please file an issue at https://bugs.chromium.org/p/v8/issues/entry and cc me (zhin@chromium.org). Thanks!

mattdesl commented 3 years ago

Hey, thanks! I'd be happy to get more SIMD working here. I've tried the instructions there and many of the NEON intrinsics seem to compile OK, but the compiler fails on some specific functions with undeclared identifier errors:

vabdq_u8
vhsubq_u8
vqsubq_u8
vhaddq_u8
... and more
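For context, the four intrinsics named above each apply a simple operation to 16 unsigned 8-bit lanes. The sketch below gives portable scalar C equivalents based on the standard NEON definitions. The function names (abd_u8, hadd_u8, hsub_u8, qsub_u8, map16_u8) are invented for this example and are not part of minih264, SIMDe, or Emscripten.

```c
#include <stdint.h>

/* vabdq_u8: absolute difference |a - b| per lane. */
uint8_t abd_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : b - a);
}

/* vhaddq_u8: halving add, (a + b) >> 1 computed in wider precision (truncating). */
uint8_t hadd_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + (unsigned)b) >> 1);
}

/* vhsubq_u8: halving subtract, (a - b) >> 1 in wider precision, low 8 bits kept.
 * Unsigned wraparound keeps the low byte identical to the floor-shifted result. */
uint8_t hsub_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a - (unsigned)b) >> 1);
}

/* vqsubq_u8: saturating subtract, clamps at 0 instead of wrapping. */
uint8_t qsub_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a > b ? a - b : 0);
}

typedef uint8_t (*lane_op_u8)(uint8_t, uint8_t);

/* Applies a lane operation across a 16-byte vector, like the q-register forms. */
void map16_u8(uint8_t dst[16], const uint8_t a[16], const uint8_t b[16], lane_op_u8 op)
{
    for (int i = 0; i < 16; i++) {
        dst[i] = op(a[i], b[i]);
    }
}
```

For example, map16_u8(dst, a, b, qsub_u8) would stand in for one vqsubq_u8 call, at scalar speed.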

My code for the NEON SIMD branch is now here:

https://github.com/mattdesl/mp4-h264/tree/simd

seanptmaher commented 3 years ago

Ah, yeah, there are still a few instructions which aren't implemented in SIMDe's NEON implementation. If you want to use them, you'd have to either send a PR to SIMDe or refactor the code to avoid them. Sorry about that.
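One way the refactoring route could look, as a sketch and not something taken from minih264 or SIMDe: vabdq_u8, vqsubq_u8, vhaddq_u8 and vhsubq_u8 can each be rewritten using only min, max, and, xor, shift and wrapping subtract, all within 8 bits. The code below shows the identities on a single lane in portable C. The function names are hypothetical, and the NEON analogues in the comments assume those simpler intrinsics are in the supported subset, which this thread does not confirm.

```c
#include <stdint.h>

/* |a - b| as max - min.
 * Possible NEON form: vsubq_u8(vmaxq_u8(a, b), vminq_u8(a, b)). */
uint8_t abd_alt(uint8_t a, uint8_t b)
{
    uint8_t hi = a > b ? a : b;
    uint8_t lo = a > b ? b : a;
    return (uint8_t)(hi - lo);
}

/* Saturating subtract as a - min(a, b).
 * Possible NEON form: vsubq_u8(a, vminq_u8(a, b)). */
uint8_t qsub_alt(uint8_t a, uint8_t b)
{
    uint8_t m = a < b ? a : b;
    return (uint8_t)(a - m);
}

/* Halving add: a + b = (a ^ b) + 2 * (a & b),
 * so (a + b) >> 1 = (a & b) + ((a ^ b) >> 1), with no 9-bit intermediate. */
uint8_t hadd_alt(uint8_t a, uint8_t b)
{
    return (uint8_t)((a & b) + ((a ^ b) >> 1));
}

/* Halving subtract: a - b = (a ^ b) - 2 * (~a & b),
 * so (a - b) >> 1 = ((a ^ b) >> 1) - (~a & b), taken modulo 256. */
uint8_t hsub_alt(uint8_t a, uint8_t b)
{
    return (uint8_t)(((a ^ b) >> 1) - ((uint8_t)~a & b));
}
```

Whether a rewrite like this is faster than letting the compiler scalarize the original intrinsic would need measuring on the real encoder.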

As far as performance goes, if you're using NEON instructions which don't have any near-analogue in Wasm, you might run into slowdowns, as they'll get scalarized. (However, Emscripten is quite good at adding in shuffles to emulate instructions that at first glance might not seem doable.)