AuburnSounds / intel-intrinsics

The Dlang SIMD library
https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#techs=MMX,SSE,SSE2,SSE3,SSSE3,SSE4_1
Boost Software License 1.0

_mm256_movemask_ps failure on arm64 #113

Closed: p0nce closed this issue 1 year ago

p0nce commented 1 year ago

Code

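    else version(LDC)
    {
        // This doesn't benefit GDC (unable to inline), but benefits both LDC with SSE2 and ARM64.
        __m128 A_lo = _mm256_extractf128_ps!0(a);
        __m128 A_hi = _mm256_extractf128_ps!1(a);

        // Debug output: the full vector, then its low and high halves.
        _mm256_print_ps(a);
        _mm_print_ps(A_lo);
        _mm_print_ps(A_hi);

        // Debug output: the mask of the high half is printed first, then the low half.
        import core.stdc.stdio;
        printf("Mask = %d\n", _mm_movemask_ps(A_hi));
        printf("Mask = %d\n", _mm_movemask_ps(A_lo));

        // The high half supplies bits 4..7 of the result, the low half bits 0..3.
        return (_mm_movemask_ps(A_hi) << 4) | _mm_movemask_ps(A_lo);
    }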

on x86:

-1 -inf 0 -1 1 inf -2 -nan(ind)
-1 -inf 0 -1
1 inf -2 -nan(ind)
Mask = 12
Mask = 11

on arm64:

-1 -inf 0 -1 1 inf -2 nan
-1 -inf 0 -1
1 inf -2 nan
Mask = 4
Mask = 11

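The NaN sign is not preserved on arm64: the last lane prints as nan instead of -nan, so the high-half mask is 4 instead of 12.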
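For reference, here is a minimal scalar sketch of the mask arithmetic on the same test values. It is not the library's implementation and not the fix for this issue. `movemaskRef` and `floatFromBits` are hypothetical helpers introduced only for illustration. The negative NaN is built from the explicit bit pattern 0xFFC00000, on the assumption that this avoids depending on how a given platform produces the sign of a NaN.

```d
import core.stdc.stdio : printf;

/// Hypothetical scalar reference (not part of intel-intrinsics):
/// bit i of the result is the sign bit of lane i.
int movemaskRef(const float[4] v) pure nothrow @nogc @trusted
{
    int mask = 0;
    foreach (i; 0 .. 4)
    {
        // Reinterpret the float as its raw 32-bit pattern.
        uint bits = *cast(const(uint)*) &v[i];
        mask |= cast(int)((bits >> 31) & 1) << i;
    }
    return mask;
}

/// Hypothetical helper: builds a float from a raw bit pattern.
float floatFromBits(uint bits) pure nothrow @nogc @trusted
{
    return *cast(float*) &bits;
}

void main()
{
    // Quiet NaN with the sign bit set, written as explicit bits.
    float negNaN = floatFromBits(0xFFC0_0000);

    float[4] lo = [-1.0f, -float.infinity, 0.0f, -1.0f];
    float[4] hi = [1.0f, float.infinity, -2.0f, negNaN];

    printf("Mask hi = %d\n", movemaskRef(hi)); // lanes 2 and 3 negative: 0b1100
    printf("Mask lo = %d\n", movemaskRef(lo)); // lanes 0, 1 and 3 negative: 0b1011
    printf("Mask = %d\n", (movemaskRef(hi) << 4) | movemaskRef(lo));
}
```

With the sign bit of the NaN intact, the two halves give 12 and 11, matching the x86 output above. A high-half mask of 4 corresponds to the same vector with that one bit cleared.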

p0nce commented 1 year ago

On aarch64 + Linux, NaNs lose their sign bit, but not on Apple Silicon... nice

p0nce commented 1 year ago

okay