A platform-independent header that allows any C/C++ code containing ARM NEON intrinsic functions to be compiled for x86 target systems using SIMD intrinsic functions up to AVX2
Fix vcvt_n functions to handle 32 fraction bits #50
When a conversion with 32 fraction bits is requested, the left shift by 32 bits overflows and the result does not match real ARM output.
before:
./vcvt_test
Source: -0.000000 -0.429497 ->
VCVT N to 1 bit: 0 0
VCVT N to 31 bit: 0 3372630080
VCVT N to 32 bit: 0 0
Source: 0.000000 0.000000 ->
VCVT N to 1 bit: 0 0
VCVT N to 31 bit: 0 0
VCVT N to 32 bit: 0 0
Source: 0.214748 0.429497 ->
VCVT N to 1 bit: 0 0
VCVT N to 31 bit: 461168608 922337216
VCVT N to 32 bit: 0 0
after fix and ARM output:
Source: -0.000000 -0.429497 ->
VCVT N to 1 bit: 0 0
VCVT N to 31 bit: 0 0
VCVT N to 32 bit: 0 0
Source: 0.000000 0.000000 ->
VCVT N to 1 bit: 0 0
VCVT N to 31 bit: 0 0
VCVT N to 32 bit: 0 0
Source: 0.214748 0.429497 ->
VCVT N to 1 bit: 0 0
VCVT N to 31 bit: 461168608 922337216
VCVT N to 32 bit: 922337216 1844674432
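The overflow described above can be illustrated with a scalar model of one lane. This is only a sketch under assumptions, not the code from this pull request: the function name `ref_cvt_n_u32_f32` is hypothetical, and it assumes the outputs shown come from the unsigned float-to-fixed conversion, which rounds toward zero and saturates. Building the scale factor with an integer shift such as `1 << 32` on a 32-bit type is undefined behaviour in C, so the sketch scales with `ldexp` instead.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical scalar reference for one lane of an unsigned
 * float -> fixed-point conversion with `fbits` fraction bits (1..32).
 * Not taken from the header; for illustration only. */
static uint32_t ref_cvt_n_u32_f32(float x, int fbits)
{
    /* Multiply by 2^fbits in floating point. No integer shift is
     * involved, so fbits == 32 cannot overflow the scale factor. */
    double scaled = ldexp((double)x, fbits);

    /* Negative values, zero and NaN all map to 0 for an unsigned result. */
    if (!(scaled > 0.0))
        return 0;

    /* Saturate anything that does not fit in 32 bits. */
    if (scaled >= 4294967296.0)
        return UINT32_MAX;

    /* The cast truncates toward zero. */
    return (uint32_t)scaled;
}

/* Small self-check using inputs that are exactly representable. */
static void self_check(void)
{
    assert(ref_cvt_n_u32_f32(0.5f, 32) == 2147483648u);  /* 0.5 * 2^32 */
    assert(ref_cvt_n_u32_f32(0.25f, 31) == 536870912u);  /* 0.25 * 2^31 */
    assert(ref_cvt_n_u32_f32(-0.5f, 31) == 0u);          /* negative -> 0 */
    assert(ref_cvt_n_u32_f32(1.0f, 32) == UINT32_MAX);   /* saturates */
}
```

Calling `self_check()` from a test program exercises the 31-bit and 32-bit cases that differ between the "before" and "after" outputs above: a negative input yields 0 and a 32-bit conversion yields a non-zero result.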
Test app