intel / ARM_NEON_2_x86_SSE

A platform-independent header that allows any C/C++ code containing ARM NEON intrinsic functions to be compiled for x86 target systems using SIMD intrinsic functions up to AVX2

vshrn_n_u16 does signed arithmetic right shift #68

Closed Matshias closed 2 months ago

Matshias commented 5 months ago

The vshrn_n_u16 intrinsic uses vshrq_n_s16(a,b) internally, so it performs a signed (arithmetic) right shift. It should perform a logical (unsigned) right shift.

Zvictoria commented 4 months ago

Hi Matshias, thanks for noticing it. Have you seen any real bugs with vshrn_n_u16? If the shift is at most 8 bits, as the function header says (_NEON2SSESTORAGE uint8x8_t vshrn_n_u16(uint16x8_t a, __constrange(1,8) int b);), it doesn't matter which shift is used, the unsigned or the signed one, because the narrowing step keeps only the low 8 bits of each lane and the sign-fill bits never reach them. It matters only for shifts > 8.
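
To make the argument above concrete, here is a small scalar sketch in plain C that models a single 16-bit lane. It is not the library's code: the helper names (narrow_logical, narrow_arithmetic, count_mismatches) are made up for illustration, and it assumes the narrowing step simply truncates to the low 8 bits, as the NEON VSHRN instruction does. The arithmetic shift is emulated by hand because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* Logical right shift of one 16-bit lane, then truncate to the low 8 bits. */
static uint8_t narrow_logical(uint16_t x, int b)
{
    return (uint8_t)(x >> b);
}

/* Arithmetic right shift of the same lane, emulated portably:
   shift logically, then fill the vacated top b bits with the sign bit.
   Valid for 1 <= b <= 15. */
static uint8_t narrow_arithmetic(uint16_t x, int b)
{
    uint16_t shifted = (uint16_t)(x >> b);
    if (x & 0x8000u) {
        shifted |= (uint16_t)(0xFFFFu << (16 - b));
    }
    return (uint8_t)shifted;
}

/* Count the 16-bit inputs for which the two variants disagree at shift b. */
static unsigned long count_mismatches(int b)
{
    unsigned long mismatches = 0;
    for (uint32_t x = 0; x <= 0xFFFFu; ++x) {
        if (narrow_logical((uint16_t)x, b) != narrow_arithmetic((uint16_t)x, b)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Under these assumptions, count_mismatches(b) is zero for every b from 1 to 8, since the low 8 bits of the shifted lane are original bits b through b+7 and never include a fill bit. At b = 9 the two variants diverge for every input with the top bit set. If the narrowing were saturating rather than truncating, this equivalence would not hold.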