Open AlanCui4080 opened 1 month ago
@AlanCui4080 I don't see any reason. Perhaps the function was first developped for M0 and was not upgraded to support all other architectures.
@christophe0606
Sorry for my mistake, it's ok to include arm_neon.h on M4F, even there is only a subset of NEON implemented on M4F, but the vcvt_s32_f32 will call a SIMD instruction "FCVTZS Vd.2S,Vn.2S" which is invalid on M4F. The only vaild one is "VCVT.F32.S32 Sd,Sm #fbits" included in FPv4-SP. I'm testing it on my STM32G4, if that be ok, i will put a pull request.
@AlanCui4080 I won't include arm_neon.h
for M4F.
There are thus two possibilities : this intrinsics is supported by the Arm C language extensions (ACLE). Unfortunately, there are still too many compilers that are not fully implementing all of the ACLE.
That's why most of the intrinsics used by CMSIS-DSP are coming from CMSIS-Core (part of CMSIS-6). And probably, you'll need to open an issue on CMSIS-Core if you want this new intrinsic to be supported
@christophe0606 I figure out it, ACLE have no single precision version (FPv4-SP) for vcvt_s32_f32, it's either d-register or q-register in ACLE. So I do believe someone forget to add it into ACLE, because this instruction is unpopular and not wide used.
And the CMSIS-Core seems to be not include any part of FPU instructions, where should i put it in.
Note: following inline asm is proved usable as a replacement of arm_float_to_q31.
asm("vcvt.s32.f32 %0, %0, #31": "+t"(thetaf_divpi.f)::);
@christophe0606 Hi, Is that ok to add a inline assembly specilized for M4? If so, i will open a pull request.
@AlanCui4080 asm definitions are provided through CMSIS-Core by defining new intrinsics macros in the compiler headers
So you can try to open a github issue on the CMSIS repository.
@christophe0606, @AlanCui4080, where does this new instrinsic belong to? Is this one the correct location? https://github.com/ARM-software/CMSIS_6/blob/8c4dc58928b3347f6aa98b6fb2bf6770f32a72b7/CMSIS/Core/Include/cmsis_gcc.h#L877-L992
Please feel free to raise a PR adding the required define. Please bare in mind that this is location not specific to Cortex-M4F. Nor is is specialized for Arm-v7EM but is pulled for A- and R-class as well. If this is strictly for Arm-v7EM, it would need to go into https://github.com/ARM-software/CMSIS_6/blob/main/CMSIS/Core/Include/m-profile/cmsis_gcc_m.h et al.
@JonatanAntoni @AlanCui4080 It must be added to each compiler and would be for Cortex-M only. (Since the intrinsics is defined with Neon header).
I don't have compiler specific implementation (except in case of major compiler bug) in CMSIS-DSP so this new intrinsics should be made available for all compiler so that CMSIS-DSP can still be built with all of them.
Otherwise, it is now possible to define functions as WEAK for linker in CMSIS-DSP. So, another possibility is for a user to replace a specific function with a different implementation if for some reason we can't do it on our side.
@christophe0606, @AlanCui4080, where does this new instrinsic belong to? Is this one the correct location? https://github.com/ARM-software/CMSIS_6/blob/8c4dc58928b3347f6aa98b6fb2bf6770f32a72b7/CMSIS/Core/Include/cmsis_gcc.h#L877-L992
@JonatanAntoni No, that is not a DSP instruction, it's belong to FPU. I have never seen intrinsics for FPUs in CMSIS-Core. I think that's because CMSIS-DSP is designed for fixed-point at beginning, and FPU instructions are duty of compliers.
Meanwhile, there is no complier include intrinsics for FPU instuctions (But have MVE&NEON) to itself. Well, so i think people all focused on newer architecture such one have MVE. So armv7 old FPU just be forgot like 8087 insturctions.
See https://developer.arm.com/documentation/ddi0439/b/BEHJADED, since there is no absolute value statement, square root statement, or q31 to float statement in C language, compliers will never generate such instructions to accelerate program running. For newlib, they wrote a inline asm for sqrtf() see https://github.com/bminor/newlib/blob/master/newlib/libm/machine/arm/e_sqrt.c, this fact also pointed out that "there is no complier include intrinsics for FPU instuctions". By the way, i didn't see they did the same thing for "FABS.F32" or "FCVT.XX.XX", may because just no one cared.
These FPU instructions are orphans, i have no idea about where to add them.
Hi,
As what we know, Cortex-M4F implemented a little set of FPU instructions including VCVT.F32.S32, and GCC did have a builtin intrinsic for it. But the question is why the intrinsic is only enabled when NEON available in GCC, and also, why not to use it in arm_float_to_q31.
Alan.