intel / ARM_NEON_2_x86_SSE

A platform-independent header that allows any C/C++ code containing ARM NEON intrinsic functions to be compiled for x86 target systems using SIMD intrinsic functions up to AVX2

Rounding error in vqrshrun_n_s64 #71

Closed: joho-11 closed this issue 2 months ago

joho-11 commented 2 months ago

I believe that line 8770 in NEON_2_SSE.h should be changed from this:

    res64 = (atmp[1] >> b) + ( (atmp[0] & ((int64_t)1 << (b - 1))) >> (b - 1) );

to this:

    res64 = (atmp[1] >> b) + ( (atmp[1] & ((int64_t)1 << (b - 1))) >> (b - 1) );

Test with a[0] = 291408416384, a[1] = 611251267456, b = 16: res.m64_u32[1] = 9326955, but it should be 9326954.
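To make the test case easier to check by hand, here is a minimal scalar sketch of the two lane-1 expressions quoted above. It is not the library's code: the function names (`lane1_as_reported`, `lane1_proposed`, `saturate_u32`, `check_issue_case`) are hypothetical, and the clamp to the `uint32_t` range is only assumed from the saturating, unsigned-narrowing meaning of the intrinsic's name. With b = 16 the rounding bit is bit 15, which is set in a[0] and clear in a[1], so reading it from a[0] adds a spurious 1 to the second lane.

```c
#include <assert.h>
#include <stdint.h>

/* Lane-1 expression as quoted in the issue: the rounding bit is read
 * from atmp[0] even though the shifted value comes from atmp[1]. */
static int64_t lane1_as_reported(const int64_t atmp[2], int b)
{
    return (atmp[1] >> b) + ((atmp[0] & ((int64_t)1 << (b - 1))) >> (b - 1));
}

/* Lane-1 expression with the proposed change: the rounding bit is read
 * from atmp[1], the same element that is shifted. */
static int64_t lane1_proposed(const int64_t atmp[2], int b)
{
    return (atmp[1] >> b) + ((atmp[1] & ((int64_t)1 << (b - 1))) >> (b - 1));
}

/* Hypothetical helper (an assumption, not taken from the header):
 * clamp a 64-bit result into the uint32_t range. */
static uint32_t saturate_u32(int64_t v)
{
    if (v < 0) return 0;
    if (v > (int64_t)UINT32_MAX) return UINT32_MAX;
    return (uint32_t)v;
}

/* Reproduces the numbers from the report. The inputs are non-negative,
 * so the implementation-defined behavior of >> on negative values in C
 * does not come into play here. */
static void check_issue_case(void)
{
    const int64_t a[2] = { 291408416384LL, 611251267456LL };
    const int b = 16;

    assert(((a[0] >> (b - 1)) & 1) == 1); /* bit 15 of a[0] is set   */
    assert(((a[1] >> (b - 1)) & 1) == 0); /* bit 15 of a[1] is clear */

    assert(saturate_u32(lane1_as_reported(a, b)) == 9326955u);
    assert(saturate_u32(lane1_proposed(a, b)) == 9326954u);
}
```

Calling `check_issue_case()` from a `main` function should return without tripping any assertion, provided the program is built without `NDEBUG`.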

Zvictoria commented 2 months ago

@joho-11 Thanks for reporting. This will be fixed in the next commit.