emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler
Other
25.68k stars 3.29k forks source link

emsdk/upstream/emscripten/cache/sysroot/include/compat/emmintrin.h is not lib/clang/19/include/**/emmintrin.h #22226

Open gblikas opened 2 months ago

gblikas commented 2 months ago

I'm looking to narrow down a build issue. I am compiling opencv4 via

vcpkg install opencv4:wasm32-emscripten --editable

and have modified a bad header to #include <emmintrin.h> as it is a missed dependency. I did this because the documentation for Using SIMD with WebAssembly indicates that _mm_setr_epi64 should be available via #include <emmintrin.h>. Judging by my error's output:

/Users/gblikas/projects/vcpkg/buildtrees/opencv4/src/4.8.0-2bf495557d/modules/core/include/opencv2/core/hal/intrin_sse.hpp:252:15: error: use of undeclared identifier '_mm_setr_epi64'; did you mean '_mm_set_epi64'?
  252 |         val = _mm_setr_epi64((__m64)v0, (__m64)v1);
      |               ^~~~~~~~~~~~~~
      |               _mm_set_epi64
/Users/gblikas/projects/emsdk/upstream/emscripten/cache/sysroot/include/compat/emmintrin.h:1133:1: note: '_mm_set_epi64' declared here
 1133 | _mm_set_epi64(long long q1, long long q0)
      | ^

and

$ cat /Users/gblikas/projects/emsdk/upstream/emscripten/cache/sysroot/include/compat/emmintrin.h  | grep "_setr"
_mm_setr_pd(double __c0, double __c1)
_mm_setr_epi32(int i0, int i1, int i2, int i3)
_mm_setr_epi16(short w0, short w1, short w2, short w3, short w4, short w5, short w6, short w7)
_mm_setr_epi8(char b0, char b1, char b2, char b3, char b4, char b5, char b6, char b7, char b8, char b9, char b10, char b11, char b12, char b13, char b14, char b15)

it appears as if the included header emmintrin.h from **/upstream/emscripten/cache/sysroot/include/compat/emmintrin.h is not the same as any of the lib/clang/19/include/**/emmintrin.h headers.

I'm not sure what emscripten/cache/sysroot is, so this issue is most-likley user error. However,

$ grep -lR "_mm_setr_epi64" emsdk
emsdk/upstream/emscripten/test/sse/test_sse2.cpp
emsdk/upstream/emscripten/test/sse/benchmark_sse2.cpp
emsdk/upstream/lib/clang/19/include/emmintrin.h
emsdk/upstream/lib/clang/19/include/ppc_wrappers/emmintrin.h

does show that _mm_setr_epi64 should be found in emsdk/upstream/lib/clang/19/include/**/emmintrin.h. Is there a known work around for this?ng the

If this issue is better suited for the opencv repository, or vcpkg, I'll move the issue over there. However I believe this is an #include issue with emscripten-core.


Potentially Related issue

Version Info

$ emcc -v 
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.62 (34c1aa36052b1882058f22aa1916437ba0872690)
clang version 19.0.0git (https:/github.com/llvm/llvm-project c00ada070207979f092be9046a02fcfff8b9f9ce)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /Users/gblikas/projects/emsdk/upstream/bin
sbc100 commented 2 months ago

The compat/emmintrin.h in the emscripten sysroot looks like it uses wasm intrinsics which is likely better than the clang builtin one that is using __builtin_ia32 intrinsics.

I'm guess we simply have not updated it in a while and we need to add the missing _mm_setr_epi64 declaration.

@tlively do you know about the history of these two files?

sbc100 commented 2 months ago

It looks like the old version of this header (which was removed in #7924) did contain this function but when it was re-introduced in #11243 that function was not included.

CC @juj

gblikas commented 2 months ago

The compat/emmintrin.h in the emscripten sysroot looks like it uses wasm intrinsics which is likely better than the clang builtin one that is using __builtin_ia32 intrinsics.

I'm guess we simply have not updated it in a while and we need to add the missing _mm_setr_epi64 declaration.

@tlively do you know about the history of these two files?

@sbc100 Thanks for the prompt response. If it's not too involved, I can push a PR.

gblikas commented 2 months ago

It looks like the old version of this header (which was removed in #7924) did contain this function but when it was re-introduced in #11243 that function was not included.

CC @juj

@sbc100 So, I got a working solution. I'm now formalizing the patch in emscripten-core, and will post a PR we can edit/tune in, in a moment.

Additionally, I'm looking at the definitions, https://github.com/emscripten-core/emscripten/blob/c8904662caf1c7e43e29f5881f6ce615360f9c68/system/include/compat/emmintrin.h#L1180-L1196 and wondering why they are preferring builtin types: char, short, int, long, long long. It seems to be that it's better to transition them to their int8_t, int16_t, int32_t, int64_t, and __m64 specific types, in order to provide more portable and readable code. @tlively or @juj do you have any suggestions on this?

I understand that wasm_simd128.h typedef's the builtin types, and perhaps the reason for leaving

static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_setr_epi64(__m64 q0, __m64 q1)

out was the fact that wasm_simd128.h doesn't actually contain a direct typedef for __m64?

tlively commented 2 months ago

I would be fine with using the intX_t types.

__m64 is apparently defined in xmmintrin.h, so if it's missing, that would be the place to add it, not wasm_simd128.h.

gblikas commented 2 months ago

@tlively or @sbc100 do you have an idea why the conditional in https://github.com/emscripten-core/emscripten/pull/7924 (was the PR it was removed):

#ifndef __EMSCRIPTEN__ // MMX support is not available in Emscripten/SIMD.js.
static __inline__ __m128i __attribute__((__always_inline__, __nodebug__))
_mm_setr_epi64(__m64 q0, __m64 q1)
{
  return (__m128i){ (long long)q0, (long long)q1 };
}
#endif

was based on the __EMSCRIPTEN__ macro instead of something MMX/SEE specific; what's the history here wrt the comment?

tlively commented 2 months ago

These headers are only ever included in Emscripten builds, so #ifndef __EMSCRIPTEN__ is functionally equivalent to #if 0. I guess this function was removed from the build because it could not be efficiently supported on top of SIMD.js back in the day, but I don't know why exactly.

gblikas commented 2 months ago

@tlively https://github.com/emscripten-core/emscripten/pull/22243 should be good then.

juj commented 2 months ago

It looks like the old version of this header (which was removed in #7924) did contain this function but when it was re-introduced in #11243 that function was not included.

CC @juj

Originally when I wrote 128-bit SSE,SSE2, ... support, I did not add any of the old legacy 64-bit MMX support, which is considered obsolete: https://www.phoronix.com/news/LLVM-Clang-MMX-Intrinsics-SSE2 and https://groups.google.com/g/llvm-dev/c/7FEDyVe2ipQ/m/PY4F6hElAgAJ?pli=1

The reason was that Wasm SIMD128 did not support any 64-bit intrinsics.

Note that there is a little bit of overlap with 128-bit SSE and 64-bit MMX, as the SSE instruction sets also added (for completeness) some 64-bit MMX registers instructions in them: https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=5740&techs=SSE_ALL&text=__m64

so as result, not "faking" 64-bit intrinsics resulted in those SSE/SSE2/SSE3 functions being dropped out that had a __m64 in their signature.

I don't oppose adding support to 64-bit MMX functions, emulated by 128-bit Wasm SIMD128. That is the path that LLVM is taking as well, as mentioned in the above phoronix link.

allsey87 commented 1 month ago

@gblikas I think you might be taking the wrong approach with all this. OpenCV has support for WebAssembly SIMD and does not require SSE or SSE2 mappings/emulation.

The problem is that when building OpenCV for Emscripten, CV_CPU_COMPILE_SSE2 is defined by default via the CMake logic which results in the incorrect header files being included.

The solution to your problem is to disable SSE by passing the following options to CMake:

-DCV_ENABLE_INTRINSICS=ON -DCPU_BASELINE= -DCPU_DISPATCH=

Hope this helps!

juj commented 1 month ago

For anyone working close on OpenCV, it would actually be amazing to read a writeup of how the SSE path vs handwritten Wasm SIMD path vs a ARM Neon path compilation in Emscripten works out with performance. That is, since Emscripten supports all of those, users can choose which SIMD backend to target.

It would be a really nice investigation to see "here's how much of a performance difference there is if you use Wasm SIMD instead of SSE" in different OpenCV tasks.

allsey87 commented 1 month ago

That is, since Emscripten supports all of those

Well, the SSE path for OpenCV is broken at the moment with some necessary functions missing (see https://github.com/emscripten-core/emscripten/pull/22243). I haven't tried the NEON path yet.