intel / ARM_NEON_2_x86_SSE

A platform-independent header that allows any C/C++ code containing ARM NEON intrinsic functions to be compiled for x86 target systems, using SIMD intrinsic functions up to AVX2

vmull_n_u16 function is wrong #24

Closed by zjd1988 5 years ago

zjd1988 commented 5 years ago

This function is meant for U16, but its implementation uses S16. Please check the code below.

    ......
    _NEON2SSESTORAGE uint32x4_t vmull_n_u16(uint16x4_t vec1, uint16_t val2); // VMULL.s16 q0,d0,d0[0]
    _NEON2SSE_INLINE uint32x4_t vmull_n_u16(uint16x4_t vec1, uint16_t val2) // VMULL.s16 q0,d0,d0[0]
    {
        uint16x4_t b16x4;
        b16x4 = vdup_n_s16(val2);
        return vmull_s16(vec1, b16x4);
    }

Zvictoria commented 5 years ago

Thanks for reporting! It is certainly a misprint that went unnoticed. Just replace vmull_s16 with vmull_u16 and it should work. Many thanks for reporting!
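
To illustrate why the signed/unsigned mix-up matters, here is a minimal scalar sketch of a single lane of the widening 16x16 -> 32 bit multiply. The helper names `widen_mul_u16` and `widen_mul_as_s16` are hypothetical and not part of the header; they only model the arithmetic, assuming the 16-bit inputs are reinterpreted as two's-complement values in the signed case.

```c
#include <stdint.h>

/* Hypothetical scalar model of one lane of an unsigned widening multiply:
   both 16-bit inputs are zero-extended to 32 bits before multiplying. */
static uint32_t widen_mul_u16(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}

/* Hypothetical scalar model of what happens when the same bits are fed to a
   signed widening multiply: inputs with the top bit set are sign-extended,
   so they are treated as negative numbers. */
static uint32_t widen_mul_as_s16(uint16_t a, uint16_t b)
{
    /* Reinterpret the 16-bit patterns as two's-complement signed values. */
    int32_t sa = (a >= 0x8000u) ? (int32_t)a - 0x10000 : (int32_t)a;
    int32_t sb = (b >= 0x8000u) ? (int32_t)b - 0x10000 : (int32_t)b;

    /* The product fits in int32_t; converting to uint32_t keeps the bit pattern. */
    return (uint32_t)(sa * sb);
}
```

For inputs below 0x8000 the two models agree, which may be why a misprint like this can go unnoticed. With 40000 and 3, the unsigned model gives 120000, while the signed one gives 4294890688.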

zjd1988 commented 5 years ago

@Zvictoria I want to thank you guys for creating ARM_NEON_2_x86_SSE. It makes it convenient to run and debug ARM code on Windows.