Closed (p0nce closed 1 year ago)
Code
    else version(LDC)
    {
        // this doesn't benefit GDC (unable to inline), but benefits both LDC with SSE2 and ARM64
        __m128 A_lo = _mm256_extractf128_ps!0(a);
        __m128 A_hi = _mm256_extractf128_ps!1(a);
        _mm256_print_ps(a);
        _mm_print_ps(A_lo);
        _mm_print_ps(A_hi);
        import core.stdc.stdio;
        printf("Mask = %d\n", _mm_movemask_ps(A_hi));
        printf("Mask = %d\n", _mm_movemask_ps(A_lo));
        return (_mm_movemask_ps(A_hi) << 4) | _mm_movemask_ps(A_lo);
    }
on x86:
    -1 -inf 0 -1 1 inf -2 -nan(ind)
    -1 -inf 0 -1
    1 inf -2 -nan(ind)
    Mask = 12
    Mask = 11
on arm64:
    -1 -inf 0 -1 1 inf -2 nan
    -1 -inf 0 -1
    1 inf -2 nan
    Mask = 4
    Mask = 11
NaN sign is not preserved
On aarch64 + Linux, NaNs lose their sign bit, but not on Apple Silicon... nice
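To see why the high-half mask differs (12 vs 4), here is a minimal scalar sketch of what a 4-lane sign-mask computes, assuming standard IEEE-754 single-precision bit patterns. `signMask4` is a hypothetical helper for illustration only, not part of the library; it works on raw bit patterns so that no floating-point operation can alter the NaN along the way.

```d
import core.stdc.stdio : printf;

/// Scalar reference: bit i of the result is the sign bit (bit 31) of lane i.
/// Hypothetical helper, operating on raw IEEE-754 bit patterns.
int signMask4(const uint[4] laneBits)
{
    int mask = 0;
    foreach (i; 0 .. 4)
        mask |= cast(int)(laneBits[i] >> 31) << i;
    return mask;
}

void main()
{
    // High lanes from the report: 1, inf, -2, NaN.
    // 0xFFC00000 is a quiet NaN with the sign bit set,
    // 0x7FC00000 is the same NaN with the sign bit cleared.
    const uint[4] nanSigned   = [0x3F800000, 0x7F800000, 0xC0000000, 0xFFC00000];
    const uint[4] nanUnsigned = [0x3F800000, 0x7F800000, 0xC0000000, 0x7FC00000];

    printf("Mask = %d\n", signMask4(nanSigned));   // prints "Mask = 12"
    printf("Mask = %d\n", signMask4(nanUnsigned)); // prints "Mask = 4"
}
```

The two results match the x86 and arm64 outputs above, which is consistent with the only difference being the NaN's sign bit in the last lane. Where exactly that bit is dropped on aarch64 + Linux is not established here.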
okay