Closed: ole2410 closed this issue 1 year ago
Hello @ole2410, and thanks for the report.
According to our notes in https://github.com/simd-everywhere/simde/commit/b471fcfd5c1602d17d59038e59b1e005f3dc4fd9 , MMX is not available on MSVC in 64-bit mode, and _mm_loadh_pi uses __m64 *, which is an MMX type.
Maybe things have changed, so I've kicked off a test that skips those defined(SIMDE_X86_MMX_NATIVE) requirements for SSE* intrinsics at https://ci.appveyor.com/project/mr-c/simde/builds/46987490
That failed, so I tested ungating only simde_mm_loadh_pi (as that does appear on https://learn.microsoft.com/en-us/cpp/intrinsics/x64-amd64-intrinsics-list?view=msvc-170 , unlike every other intrinsic that uses MMX types) over in https://ci.appveyor.com/project/mr-c/simde/builds/46987731 , and so far that appears to be working.
If all goes well, I'll include this fix in the upcoming SIMDe 0.7.6 release later this week.
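The ungating described above can be sketched roughly as follows. This is not SIMDe's actual source: SKETCH_SSE_NATIVE and sketch_loadh_pi are hypothetical names made up for illustration, and the detection macros are an assumption about how one might detect SSE. The point is only that the native path is selected on SSE availability alone, with no MMX check, and a portable fallback is used otherwise.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical feature macro for this sketch (not a SIMDe macro).
 * It is set whenever the compiler targets SSE; MMX is deliberately
 * not part of the condition. */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#  define SKETCH_SSE_NATIVE 1
#  include <xmmintrin.h>
#endif

/* Writes { a[0], a[1], mem[0], mem[1] } to out. */
void sketch_loadh_pi(float out[4], const float a[4], const float mem[2]) {
  assert(out != NULL && a != NULL && mem != NULL);
#if defined(SKETCH_SSE_NATIVE)
  /* Native path: _mm_loadh_pi takes a __m64 pointer, but it is an
   * SSE intrinsic, so it is gated on SSE only. */
  __m128 v = _mm_loadu_ps(a);
  v = _mm_loadh_pi(v, (const __m64 *) mem);
  _mm_storeu_ps(out, v);
#else
  /* Portable fallback: build the result in a temporary so that
   * out may alias a. */
  float tmp[4];
  memcpy(&tmp[0], a, 2 * sizeof(float));
  memcpy(&tmp[2], mem, 2 * sizeof(float));
  memcpy(out, tmp, sizeof(tmp));
#endif
}
```

When the native path is taken, an optimizing compiler would be expected to turn the _mm_loadh_pi call into a single movhps, which is the behaviour the report asks for.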
With MSVC x64, simde_mm_loadh_pi is not handled correctly: it is not replaced by the native _mm_loadh_pi intrinsic, and the result is poor performance.
Only the MSVC x64 compiler seems to be affected. It works properly with MSVC x86 and the other compilers I tested.
Here is a short code example for Compiler Explorer:
The expected behaviour is that simde_mm_loadh_pi compiles to a single movhps instruction, which is what happens with all compilers except MSVC x64.
The related simde_mm_loadl_pi function seems to work fine.
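For reference, the two operations differ only in which half of the vector they overwrite. The scalar sketch below spells that out; ref_f32x4, ref_loadh_pi and ref_loadl_pi are hypothetical names used only here, not SIMDe or compiler APIs. The native forms correspond to movhps (high half) and movlps (low half).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a 128-bit vector of four floats. */
typedef struct { float f32[4]; } ref_f32x4;

/* Same result as _mm_loadh_pi: keep lanes 0-1 of a, load lanes 2-3 from mem. */
ref_f32x4 ref_loadh_pi(ref_f32x4 a, const float mem[2]) {
  assert(mem != NULL);
  memcpy(&a.f32[2], mem, 2 * sizeof(float));
  return a;
}

/* Same result as _mm_loadl_pi: load lanes 0-1 from mem, keep lanes 2-3 of a. */
ref_f32x4 ref_loadl_pi(ref_f32x4 a, const float mem[2]) {
  assert(mem != NULL);
  memcpy(&a.f32[0], mem, 2 * sizeof(float));
  return a;
}
```

A reference like this can be handy for checking that a native and a fallback build agree lane by lane.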