emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

256-bit AVX intrinsics support #21684

Open jiepan-intel opened 3 months ago

jiepan-intel commented 3 months ago

Compiling existing x86 SSE/AVX SIMD code to WASM SIMD is very attractive, because developers can reuse existing libraries without rewriting them. However, only the 128-bit subset of the AVX intrinsics is currently supported, and much existing code does not fit within this restriction. Adding support for the 256-bit AVX intrinsics would broaden the applicable scenarios and might also improve performance. Does Emscripten have a plan for this?

Google Highway currently supports a WASM_EMU256 target (a 2x unrolled version of wasm128). In addition, a re-vectorization optimization phase is being developed in Google's V8 JavaScript engine that can pack two SIMD128 nodes into one SIMD256 node.

Sample code for AVX intrinsics support:

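#include <wasm_simd128.h> /* v128_t, wasm_f32x4_add */
#include <xmmintrin.h>    /* Emscripten's emulated __m128 */

/* A 256-bit vector represented as two 128-bit halves. */
typedef struct Vec256 {
  __m128 v0;
  __m128 v1;
} __m256;

/* Add each 128-bit half independently with a Wasm SIMD128 add. */
static __inline__ __m256 __attribute__((__always_inline__, __nodebug__))
_mm256_add_ps(__m256 __a, __m256 __b) {
  __m256 c;
  c.v0 = (__m128)wasm_f32x4_add((v128_t)__a.v0, (v128_t)__b.v0);
  c.v1 = (__m128)wasm_f32x4_add((v128_t)__a.v1, (v128_t)__b.v1);
  return c;
}

tlively commented 3 months ago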

Given our precedent for providing emulation of SSE and Neon intrinsics, it seems reasonable to provide emulation for AVX intrinsics as well. We don't have any work planned for this, but contributions would be welcome.
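
For anyone considering a contribution, here is a rough sketch of how the same two-half pattern might extend beyond `_mm256_add_ps`. It is only an illustration, not Emscripten's implementation: it assumes GCC/Clang vector extensions as a stand-in for `v128_t` so it can be compiled natively, and every name in it (`emu128_ps`, `emu256_ps`, and the `emu256_*` functions) is hypothetical. Elementwise operations simply apply the 128-bit operation to each half, while an operation that picks a 128-bit lane, in the spirit of `_mm256_extractf128_ps`, reduces to selecting one of the two struct members.

```c
#include <string.h>

/* Hypothetical stand-in for a 128-bit SIMD register (GCC/Clang vector extension). */
typedef float emu128_ps __attribute__((vector_size(16)));

/* Hypothetical 256-bit vector stored as two 128-bit halves, as in the sample above. */
typedef struct {
  emu128_ps v0; /* elements 0..3 */
  emu128_ps v1; /* elements 4..7 */
} emu256_ps;

/* Unaligned load of eight floats, in the spirit of _mm256_loadu_ps. */
static inline emu256_ps emu256_loadu_ps(const float *p) {
  emu256_ps r;
  memcpy(&r.v0, p, sizeof r.v0);
  memcpy(&r.v1, p + 4, sizeof r.v1);
  return r;
}

/* Unaligned store of eight floats, in the spirit of _mm256_storeu_ps. */
static inline void emu256_storeu_ps(float *p, emu256_ps a) {
  memcpy(p, &a.v0, sizeof a.v0);
  memcpy(p + 4, &a.v1, sizeof a.v1);
}

/* Elementwise multiply: apply the 128-bit operation to each half. */
static inline emu256_ps emu256_mul_ps(emu256_ps a, emu256_ps b) {
  emu256_ps r;
  r.v0 = a.v0 * b.v0;
  r.v1 = a.v1 * b.v1;
  return r;
}

/* Lane selection: bit 0 of imm picks the lower (0) or upper (1) half. */
static inline emu128_ps emu256_extractf128_ps(emu256_ps a, int imm) {
  return (imm & 1) ? a.v1 : a.v0;
}
```

In an actual Emscripten header the halves would presumably be `__m128` values and the per-half operations would be `wasm_f32x4_*` calls, as in the sample from the original post. Intrinsics that shuffle elements across the two halves would need more care than this sketch shows.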