jfalcou / eve

Expressive Vector Engine - SIMD in C++ Goes Brrrr
https://jfalcou.github.io/eve/
Boost Software License 1.0

[FEATURE] Webassembly SIMD support #982

Closed ruler501 closed 1 year ago

ruler501 commented 3 years ago

Is your feature request related to a problem? Please describe.
I'm writing a library that is used both in-browser and as a standalone application. I'd love to be able to use SIMD to accelerate the computations.

Describe the solution you'd like
A reasonably easy way to use SIMD on both x86 architectures and WebAssembly (through Emscripten) that doesn't require extensive platform-specific code.

Describe alternatives you've considered
I don't need SIMD in that many places, so I've considered hand-writing the functions for WebAssembly and switching between the implementations at compile time.

Additional context
https://emscripten.org/docs/porting/simd.html is the documentation I've been referring to.
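The hand-written alternative mentioned above could look roughly like the following. This is only a sketch: `add_arrays` is a hypothetical helper, and it assumes the compiler defines `__wasm_simd128__` when WebAssembly SIMD is enabled and `__SSE2__` when SSE2 is enabled.

```cpp
#include <cstddef>

#if defined(__wasm_simd128__)
#  include <wasm_simd128.h>   // WebAssembly SIMD intrinsics (Emscripten/clang)
#elif defined(__SSE2__)
#  include <emmintrin.h>      // x86 SSE2 intrinsics
#endif

// Hypothetical helper: a[i] += b[i] for i in [0, n).
// The SIMD flavour is picked at compile time from predefined macros.
inline void add_arrays(float* a, const float* b, std::size_t n)
{
  std::size_t i = 0;
#if defined(__wasm_simd128__)
  for (; i + 4 <= n; i += 4) {
    v128_t va = wasm_v128_load(a + i);
    v128_t vb = wasm_v128_load(b + i);
    wasm_v128_store(a + i, wasm_f32x4_add(va, vb));
  }
#elif defined(__SSE2__)
  for (; i + 4 <= n; i += 4) {
    __m128 va = _mm_loadu_ps(a + i);
    __m128 vb = _mm_loadu_ps(b + i);
    _mm_storeu_ps(a + i, _mm_add_ps(va, vb));
  }
#endif
  // Scalar tail; also the whole loop when no SIMD path is compiled in.
  for (; i < n; ++i) a[i] += b[i];
}
```

Every such function needs one branch per platform, which is the maintenance cost a library like EVE is meant to remove.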

DenisYaroshevskiy commented 3 years ago

We have been thinking about supporting WebAssembly, but we were considering targeting the proposed WebAssembly extensions. Not just yet, though.

The doc you linked says:

Emscripten supports compiling existing codebases that use x86 SSE by passing the -msse directive to the compiler, and including the header <xmmintrin.h>.

Currently only the SSE1, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, and 128-bit AVX instruction sets are supported.

This approach with intrinsic conversion is a good idea and should hopefully tide you over.

With AVX, we will by default select a 32-byte (256-bit) register, which means all sorts of unpleasantness for you. I'd suggest SSE4.2 for Emscripten; we have good support for it.
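To make the register sizes concrete, here is a small arithmetic sketch. `lanes` is a hypothetical helper, not an EVE function; it only computes how many elements fit in a register of a given byte width.

```cpp
#include <cstddef>

// Hypothetical helper: number of elements of type T that fit in a
// register that is `register_bytes` bytes wide.
template<typename T>
constexpr std::size_t lanes(std::size_t register_bytes)
{
  return register_bytes / sizeof(T);
}

static_assert(lanes<float>(16)  == 4, "SSE: a 128-bit register holds 4 floats");
static_assert(lanes<float>(32)  == 8, "AVX: a 256-bit register holds 8 floats");
static_assert(lanes<double>(32) == 4, "AVX: a 256-bit register holds 4 doubles");
```

Since Emscripten only provides the 128-bit types, a default that assumes 32-byte registers has nothing to map onto there.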

Example: eve::algo::inclusive_scan_inplace vs. std::inclusive_scan compiled for sse4.2

[Plot: benchmark image not preserved]

I'd start by running all of EVE's tests:

ninja -j 8 unit.exe && ctest -j 8 

For a production build, please do not forget -DNDEBUG; we assert quite extensively.

DenisYaroshevskiy commented 3 years ago

Do not hesitate to reach out, we are happy for people to try out the library.

Besides creating issues, you can reach us through:

cpplang Slack: jfalcou, dyaroshev
email: joel.falcou@lri.fr, denis.yaroshevskij@gmail.com
Twitter: @CppSpelunker, @dyaroshev

ruler501 commented 3 years ago

Thanks for the quick reply. I'll see if I can get setup and benchmarked this weekend.

DenisYaroshevskiy commented 3 years ago

:+1: I personally have zero doubts that the first attempt will fail, please reach out.

jfalcou commented 3 years ago

Can't we consider wasm simd as a separate target architecture ?

DenisYaroshevskiy commented 3 years ago

Can't we consider wasm simd as a separate target architecture ?

Sure, we can and should. Porting will take a long time, though. But @ruler501 wants to use EVE now, and cross-compilation should be a solid option.

DoDoENT commented 3 years ago

I did what you suggested (added -msse -msse2 -msse3 -msse4 to the compile flags and included <xmmintrin.h>) and now I'm getting the following compile errors:

|| In file included from /Users/dodo/Work/Microblink/core-eve/jfalcou-eveTest/Source/EveTest.cpp:39:
|| In file included from /Users/dodo/Work/Microblink/core-eve/eve/include/eve/wide.hpp:18:
|| In file included from /Users/dodo/Work/Microblink/core-eve/eve/include/eve/arch/wide.hpp:10:
|| In file included from /Users/dodo/Work/Microblink/core-eve/eve/include/eve/arch/cpu/wide.hpp:10:
|| In file included from /Users/dodo/Work/Microblink/core-eve/eve/include/eve/arch/as_register.hpp:13:
eve/include/eve/arch/x86/as_register.hpp|53 col 65| error: use of undeclared identifier '__m256d'
||               if constexpr(std::is_same_v<Type,double> ) return __m256d{};
||                                                                 ^
eve/include/eve/arch/x86/as_register.hpp|53 col 72| error: expected ';' after return statement
||               if constexpr(std::is_same_v<Type,double> ) return __m256d{};
||                                                                        ^
||                                                                        ;
eve/include/eve/arch/x86/as_register.hpp|54 col 9| error: expected expression
||         else  if constexpr(std::is_same_v<Type,float > ) return __m256{};
||         ^
eve/include/eve/arch/x86/as_register.hpp|55 col 9| error: expected expression
||         else  if constexpr(std::is_integral_v<Type>    ) return __m256i{};
||         ^
eve/include/eve/arch/x86/as_register.hpp|71 col 65| error: use of undeclared identifier '__m512d'
||               if constexpr(std::is_same_v<Type,double> ) return __m512d{};
||                                                                 ^
eve/include/eve/arch/x86/as_register.hpp|71 col 72| error: expected ';' after return statement
||               if constexpr(std::is_same_v<Type,double> ) return __m512d{};
||                                                                        ^
||                                                                        ;
eve/include/eve/arch/x86/as_register.hpp|72 col 9| error: expected expression
||         else  if constexpr(std::is_same_v<Type,float > ) return __m512{};
||         ^
eve/include/eve/arch/x86/as_register.hpp|73 col 9| error: expected expression
||         else  if constexpr(std::is_integral_v<Type>    ) return __m512i{};
||         ^
eve/include/eve/arch/x86/as_register.hpp|87 col 54| error: unknown type name '__mmask8'
||     template<> struct inner_mask<8>   { using type = __mmask8;  };
||                                                      ^
eve/include/eve/arch/x86/as_register.hpp|88 col 54| error: unknown type name '__mmask16'
||     template<> struct inner_mask<16>  { using type = __mmask16; };
||                                                      ^
eve/include/eve/arch/x86/as_register.hpp|89 col 54| error: unknown type name '__mmask32'
||     template<> struct inner_mask<32>  { using type = __mmask32; };
||                                                      ^
eve/include/eve/arch/x86/as_register.hpp|90 col 54| error: unknown type name '__mmask64'
||     template<> struct inner_mask<64>  { using type = __mmask64; };
||                                                      ^

... and many more similar errors.

It appears that EVE is attempting to use 256-bit AVX. Is there a way to prevent that?

jfalcou commented 3 years ago

Normally, EVE detects and includes the SSE extensions by itself. You should not have to include <xmmintrin.h> manually, as we'll do it for you.

Now, we use SPY to detect extensions, so maybe I need to fix SPY for Emscripten?

DoDoENT commented 3 years ago

Possibly.

I've created a little test like this:

TEST( EveTest, supportsSIMD )
{
#if defined( __EMSCRIPTEN__ ) && !defined( __wasm_simd128__ )
    EXPECT_EQ( eve::current_api, spy::undefined_simd_ );
    EXPECT_FALSE( eve::supports_simd );
#else
    EXPECT_NE( eve::current_api, spy::undefined_simd_ );
    EXPECT_TRUE( eve::supports_simd );
#endif
}

It always fails on Emscripten.

jfalcou commented 3 years ago

If no SIMD is detected, all operations are indeed emulated. Could you add a std::cout << eve::current_api << "\n" to your test to see which API is detected?

Alternatively, maybe we should first see if SPY (https://github.com/jfalcou/spy) works with emscripten by running the SPY test with it.

DoDoENT commented 3 years ago

If I build without -msse2, eve::current_api prints Undefined SIMD instructions set. If I build with -msse2 but comment out all usages of eve::wide, it prints X86 SSE2. If I don't comment out the usages and includes of eve::wide, compilation fails with the error above.
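Independently of SPY and EVE, one can check which SIMD-related macros the compiler itself predefines for a given set of flags. The following is a sketch under that assumption; `detected_simd_macros` is a hypothetical helper and the list of macros is not exhaustive.

```cpp
#include <string>

// Hypothetical helper: lists the SIMD-related macros predefined by the
// compiler for the current flags. It does not use SPY or EVE.
inline std::string detected_simd_macros()
{
  std::string out;
#if defined(__EMSCRIPTEN__)
  out += "__EMSCRIPTEN__ ";
#endif
#if defined(__wasm_simd128__)
  out += "__wasm_simd128__ ";
#endif
#if defined(__SSE2__)
  out += "__SSE2__ ";
#endif
#if defined(__SSE4_2__)
  out += "__SSE4_2__ ";
#endif
#if defined(__AVX__)
  out += "__AVX__ ";
#endif
#if defined(__AVX2__)
  out += "__AVX2__ ";
#endif
#if defined(__AVX512F__)
  out += "__AVX512F__ ";
#endif
  if (out.empty()) out = "(none)";
  return out;
}
```

Printing the result in both the native and the Emscripten build shows whether detection or the missing types is the problem.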

DoDoENT commented 3 years ago

Btw, Emscripten 2.0.31, which I'm using, is based on LLVM 14, but it still uses libc++ from LLVM 12, which doesn't provide the standard library support for concepts.

I've worked around this by backporting required stuff from LLVM 13 (which is released and fully supports concepts STL lib).

If you need it for your testing, grab it here: llvm-stl-polyfill.zip

Just unzip it and put it somewhere in your include path.

The same trick is also needed when building with Android NDK r23 (also based on LLVM 12) and with Xcode 13 (iOS and macOS; also based on LLVM 12).
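Whether a given toolchain ships the concepts library can be checked with the standard feature-test macro `__cpp_lib_concepts`. The sketch below assumes nothing about the polyfill archive mentioned above; `is_integral_like` is a hypothetical name used only to show a fallback.

```cpp
#include <type_traits>

#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>   // library feature-test macros live here
#  endif
#endif

#if defined(__cpp_lib_concepts)
#  include <concepts>
// The standard library provides <concepts>: use std::integral directly.
template<typename T>
inline constexpr bool is_integral_like = std::integral<T>;
#else
// Older standard library: fall back to the equivalent type trait.
template<typename T>
inline constexpr bool is_integral_like = std::is_integral_v<T>;
#endif
```

A library that needs many concepts cannot reasonably fall back like this everywhere, which is why a backport of the missing headers is the practical workaround.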

jfalcou commented 3 years ago

Yeah, the libc++ support is still flaky. As for the issue, I'll need to test it manually, as that will be faster than me telling you to poke here and there. What I don't understand is that we include a header that normally contains the definitions of all x86 SIMD types, so I don't quite get why it complains about the non-SSE ones. Or it means the WASM SIMD files are not made to support post SSE types ?

DoDoENT commented 3 years ago

Or it means the WASM SIMD files are not made to support post SSE types ?

Yes.

From the Emscripten documentation:

Only the 128-bit wide instructions from AVX instruction set are available. 256-bit wide AVX instructions are not provided.

The xmmintrin.h that ships with Emscripten supports only 128-bit SSE and AVX types and maps them onto WASM SIMD instructions. The documentation has a table that explains how efficient the emulation of each intrinsic is (some run at native speed, some are emulated with different WASM SIMD instructions, and some are scalarized).

Also, you can find Emscripten's SSE emulation header here - it may also provide a good baseline for proper WASM SIMD support in EVE.

Emscripten also provides other emulation headers, for easier porting of ARM NEON code as well. You can see how it works here, but SSE and SSE2 emulation works best (most of the intrinsics are 1:1 with WASM SIMD).

DenisYaroshevskiy commented 3 years ago

The intrinsic types not even being defined is annoying. I dealt with this when doing a clang-based refactoring at one point. My solution was to add using intr_name = int or something like that.

But that's nasty for a real use case.

DenisYaroshevskiy commented 3 years ago

You can try putting it here: https://github.com/jfalcou/eve/blob/26d50d59190d3e2817c3ca95614e3e74f6c6f30a/include/eve/arch/x86/predef.hpp#L61

Should be similar to this for arm: https://github.com/jfalcou/eve/blob/26d50d59190d3e2817c3ca95614e3e74f6c6f30a/include/eve/arch/arm/predef.hpp#L19

This is obviously a hack, so I can't tell you if it will work or not.
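A sketch of what such stubs might look like, assuming a toolchain where the wide types are undeclared, as in the error log above. The names in `sketch` are hypothetical, and distinct empty structs are used instead of `int` so that the three register types stay different for overloads and `std::is_same_v` checks. Note that names starting with `__` are reserved, which is part of why this is a hack.

```cpp
#include <type_traits>

namespace sketch {
  // Hypothetical placeholder types; they carry no data and exist only so
  // that code naming the 256-bit register types still parses.
  struct m256_stub  {};
  struct m256d_stub {};
  struct m256i_stub {};
}

#if defined(__EMSCRIPTEN__)
// Assumption: the Emscripten headers in use do not declare these names.
// If they do, these aliases conflict and must be removed.
using __m256  = sketch::m256_stub;
using __m256d = sketch::m256d_stub;
using __m256i = sketch::m256i_stub;
#endif

// The stubs must remain distinct types.
static_assert(!std::is_same_v<sketch::m256_stub, sketch::m256d_stub>);
static_assert(!std::is_same_v<sketch::m256_stub, sketch::m256i_stub>);
```

The 512-bit types and the `__mmask` types from the log would need the same treatment, and so would any function that takes them.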

jfalcou commented 3 years ago

Maybe we can actually modify spec.hpp for x86 by wrapping the >128-bit test in a macro that detects Emscripten properly?
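One way to read this suggestion is a single macro that caps the usable register width when building with Emscripten. This is a sketch only: `SKETCH_MAX_REGISTER_BYTES` and `can_use_register` are hypothetical names, not part of EVE or SPY, and the sketch is meant for x86 or Emscripten builds.

```cpp
#include <cstddef>

// Hypothetical macro: widest register, in bytes, the code may use.
#if defined(__EMSCRIPTEN__)
#  define SKETCH_MAX_REGISTER_BYTES 16   // only 128-bit types are emulated
#elif defined(__AVX512F__)
#  define SKETCH_MAX_REGISTER_BYTES 64
#elif defined(__AVX__)
#  define SKETCH_MAX_REGISTER_BYTES 32
#else
#  define SKETCH_MAX_REGISTER_BYTES 16
#endif

// Hypothetical predicate for use in `if constexpr` or template selection.
constexpr bool can_use_register(std::size_t bytes)
{
  return bytes <= SKETCH_MAX_REGISTER_BYTES;
}

#if SKETCH_MAX_REGISTER_BYTES >= 32
// Code that names __m256, __m256d or __m256i would go here, so that it is
// never seen by a compiler whose headers lack those types.
#endif

static_assert(can_use_register(16), "128-bit registers are always allowed");
```

The preprocessor guard matters because `if constexpr` outside a template still requires every name in the discarded branch to be declared.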

DenisYaroshevskiy commented 3 years ago

My opinion:

We should try to hack x86 to unblock @DoDoENT. If that takes too many code changes or is too hard, we should just hook up WASM as a proper target.

The initial commit for that is big but not that big. It will go through emulation most of the time, which will be bad (that's why I suggest hacking x86 at the moment) but we can fix the gaps as we go along.

DoDoENT commented 3 years ago

@DenisYaroshevskiy, this appears to be working for me at the moment. It compiles, runs, and actually uses WASM SIMD. I tested it on a browser that doesn't support SIMD (Safari), where the binary fails to run, as expected, and on Chrome, where it works correctly.

I'll make some more tests on my side (I'm still learning the API, the documentation is pretty scarce - I'm basically doing everything based on this example) and add more stubs if needed. Afterward, I can create a PR.

DenisYaroshevskiy commented 3 years ago

The fact that you have to stub functions as well as registers is not sustainable.

I will try to have a look at actually hooking up a proper WASM implementation this weekend. Unfortunately, @jfalcou knows that stuff much better than I do, but he's busy with other things. This will not give a complete EVE solution, but it will work and be converted as we go along.

What I think can work in the meantime is that, instead of writing the wrappers yourself, you try hooking up SIMD Everywhere, which has complete WASM <=> x86 emulation.

https://github.com/simd-everywhere/simde

Defining SIMDE_ENABLE_NATIVE_ALIASES and including SIMDe's x86 headers is probably how it's done.
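On the SIMDe side, that setup might look like the sketch below. It assumes SIMDe is on the include path as `simde/x86/sse2.h` and falls back to scalar code when it is not; `scale` and `SKETCH_HAS_SIMDE` are hypothetical names. Whether EVE itself builds on top of these aliases is not something this sketch shows.

```cpp
#include <cstddef>

#if defined(__has_include)
#  if __has_include("simde/x86/sse2.h")
#    define SIMDE_ENABLE_NATIVE_ALIASES   // must come before the SIMDe include
#    include "simde/x86/sse2.h"
#    define SKETCH_HAS_SIMDE 1
#  endif
#endif
#ifndef SKETCH_HAS_SIMDE
#  define SKETCH_HAS_SIMDE 0
#endif

// Hypothetical helper: multiplies n floats in place by `factor`.
inline void scale(float* data, std::size_t n, float factor)
{
  std::size_t i = 0;
#if SKETCH_HAS_SIMDE
  // With native aliases enabled, the usual _mm_* names resolve to SIMDe's
  // portable implementations on targets without real SSE.
  const __m128 f = _mm_set1_ps(factor);
  for (; i + 4 <= n; i += 4)
    _mm_storeu_ps(data + i, _mm_mul_ps(_mm_loadu_ps(data + i), f));
#endif
  // Scalar tail; also the whole loop when SIMDe is not available.
  for (; i < n; ++i) data[i] *= factor;
}
```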

DenisYaroshevskiy commented 3 years ago

With respect to documentation being scarce, yeah - feel free to report an issue or ask on C++ lang Slack "@jfalcou" or "@dyaroshev" - we can sketch you an example.

DenisYaroshevskiy commented 3 years ago

Given that @DoDoENT discovered EVE doesn't work for them, we will push this back a bit.

DenisYaroshevskiy commented 2 years ago

Given my schedule for at least the upcoming month, I don't see this happening soon.

jfalcou commented 2 years ago

We need to get a proper testing harness first anyway.