The intrinsics for scalar converts between fp16 and integer/fixed-point values are currently specified to always use 16-bit to 16-bit convert instructions, regardless of the size of the integer/fixed-point value. For example:
https://github.com/ARM-software/acle/blob/6eb85169a112395c8bce978fba19154efcd725ea/tools/intrinsic_db/advsimd.csv#L3795
Using the listed instruction causes certain input values to be converted incorrectly. In the above example, an int32 input of 65504 (0xFFE0) is read as a signed 16-bit integer, so it produces an fp16 value of -32.0 instead of the expected 65504.0.
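The arithmetic behind this example can be checked with a small portable sketch. It does not use the ACLE intrinsics or any Arm instruction; the helper name value_seen_by_16bit_convert is hypothetical and only models the assumption that a 16-bit signed convert reads bits 15:0 of its input as a two's-complement value.

```c
#include <stdint.h>

/* Hypothetical helper (not part of ACLE): models the integer value that a
 * 16-bit signed convert would see when given a 32-bit input, assuming it
 * reads only bits 15:0 and interprets them as two's-complement. */
static int32_t value_seen_by_16bit_convert(int32_t input)
{
    uint32_t low = (uint32_t)input & 0xFFFFu;   /* keep bits 15:0 only */

    /* If bit 15 is set, the 16-bit value is negative: subtract 2^16. */
    return (low & 0x8000u) ? (int32_t)low - 0x10000
                           : (int32_t)low;
}
```

Under that assumption, value_seen_by_16bit_convert(65504) evaluates to -32, while a 32-bit convert would see 65504 unchanged. Both -32 and 65504 are exactly representable in fp16, so the final integer-to-fp16 step does not alter either value.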
Instead, the above intrinsic should use the SCVTF Hd,Wn instruction, which better matches the input type.
This applies to all scalar converts between fp16 and 32-bit/64-bit integer/fixed-point values.
Testing with two mainstream compilers (GCC and Clang/LLVM) shows that these intrinsics often already generate the proposed instructions rather than the instructions listed in the ACLE. In particular:
GCC (tested with 9.2.0) generates the proposed instructions for all of the intrinsics.
Clang/LLVM (tested with 14.0.0) generates the proposed instructions for the integer converts, but generates the ACLE instructions for the fixed-point converts.