Open itaig1 opened 2 years ago
@llvm/issue-subscribers-backend-arm
I have done some further testing using Clang 14.0.0, showing that -mfloat-abi=hard does carry through to the generated object file. The main issue is the presence of Tag_ABI_VFP_args: VFP registers
in the hard ABI mode; this should not even be possible on a device which doesn't support hardware floating point.
The tests I performed are outlined below:
Source of test1.cpp:
double x;

double foo(double y)
{
    x += y;
    return x;
}
Test1:
Commands:
clang -target arm-none-eabi -O3 -Wall -std=gnu++20 -c -mcpu=cortex-m3 -x c++ -mfloat-abi=hard test1.cpp -o test1hard.o
readelf -A test1hard.o
Output from readelf:
Attribute Section: aeabi
File Attributes
Tag_conformance: "2.09"
Tag_CPU_name: "cortex-m3"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Microcontroller
Tag_ARM_ISA_use: No
Tag_THUMB_ISA_use: Thumb-2
Tag_ABI_PCS_R9_use: V6
Tag_ABI_PCS_GOT_use: direct
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Unused
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align_needed: 8-byte
Tag_ABI_align_preserved: 8-byte, except leaf SP
Tag_ABI_enum_size: int
Tag_ABI_VFP_args: VFP registers
Tag_ABI_optimization_goals: Aggressive Speed
Tag_CPU_unaligned_access: None
Tag_ABI_FP_16bit_format: IEEE 754
Test2:
Commands:
clang -target arm-none-eabi -O3 -Wall -std=gnu++20 -c -mcpu=cortex-m3 -x c++ -mfloat-abi=soft test1.cpp -o test1soft.o
readelf -A test1soft.o
Output from readelf:
Attribute Section: aeabi
File Attributes
Tag_conformance: "2.09"
Tag_CPU_name: "cortex-m3"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Microcontroller
Tag_ARM_ISA_use: No
Tag_THUMB_ISA_use: Thumb-2
Tag_ABI_PCS_R9_use: V6
Tag_ABI_PCS_GOT_use: direct
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Unused
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align_needed: 8-byte
Tag_ABI_align_preserved: 8-byte, except leaf SP
Tag_ABI_enum_size: int
Tag_ABI_optimization_goals: Aggressive Speed
Tag_CPU_unaligned_access: None
Tag_ABI_FP_16bit_format: IEEE 754
Hope this helps a bit more.
Regards,
itaig1
I made a mistake with item 3: it is bit 3, not bit 2. However, even with that correction the value is not always correct.
The 2 cases I have found that are incorrect are the following:
-mcpu=cortex-m7+nofp.dp
__ARM_FP is defined as 0xE
-mcpu=cortex-m55+nofp.dp
__ARM_FP is defined as 0xE
In both cases double precision is shown as present. The generated code will correctly use a library call for double precision; it is only the value that __ARM_FP
is defined to that is wrong.
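To make the reported value easier to read, here is a minimal sketch that decodes an __ARM_FP value into the precisions it advertises. It assumes the ACLE bit layout (bit 1 = half, bit 2 = single, bit 3 = double); the names kFpHalf, kFpSingle, kFpDouble, ArmFpSupport and decode_arm_fp are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Bit masks for __ARM_FP, assuming the ACLE layout (bit 0 is not used).
constexpr std::uint32_t kFpHalf   = 1u << 1;  // 0x2: half precision
constexpr std::uint32_t kFpSingle = 1u << 2;  // 0x4: single precision
constexpr std::uint32_t kFpDouble = 1u << 3;  // 0x8: double precision

// Hypothetical helper type: which precisions a given __ARM_FP value claims.
struct ArmFpSupport {
    bool half_precision;
    bool single_precision;
    bool double_precision;
};

// Decode a raw __ARM_FP value. Takes the value as a parameter so the
// function can be checked on any host, not only when targeting ARM.
constexpr ArmFpSupport decode_arm_fp(std::uint32_t arm_fp) {
    return ArmFpSupport{(arm_fp & kFpHalf) != 0,
                        (arm_fp & kFpSingle) != 0,
                        (arm_fp & kFpDouble) != 0};
}
```

With this reading, 0xE decodes to half, single and double all present, which is why the value reported for the +nofp.dp targets looks wrong. On a real target the call would be decode_arm_fp(__ARM_FP), guarded by #ifdef __ARM_FP.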
Regards,
itaig1
In addition to my last post: using the command line options -mcpu=cortex-m7 -mfpu=fpv4-sp-d16
does work correctly, however -mcpu=cortex-m7+nofp.dp
does not. Both generate correct code; only the value of __ARM_FP
is wrong in the second case.
Regards
itaig1
Looking at this, my thoughts are:
I created a review on Phabricator to handle __SOFTFP__ compatibility with GCC: https://reviews.llvm.org/D135680
@itaig1: feel free to add yourself as a reviewer. If you are registered on Phabricator, I failed in guessing your username there.
Warn if -mfloat-abi=hard is specified when floating point registers are not supported: https://reviews.llvm.org/D150902
First, a bit of background information: I'm developing an RTOS for the ARM Cortex M series cores and will also be working on the Cortex R cores at a later date. To cut a long story short, I came across several issues when dealing with floating point support, which appears to differ from GCC and the ARM ABI standards, so I looked into this more deeply and came up with the following:
This could actually be considered several bugs, but they are all connected. I have checked these issues on versions 11.0.0, 12.0.0, 13.0.0, 14.0.0 and 15.0.0 with the options
-target arm-none-eabi -O3 -std=gnu++20 -x c++
I also tested with the following CPU options:
-mcpu=cortex-m3
-mcpu=cortex-m4
-mcpu=cortex-m4+nofp
-mcpu=cortex-m4 -mfpu=none
-mcpu=cortex-m7
-mcpu=cortex-m7+nofp
-mcpu=cortex-m7+nofp.dp
-mcpu=cortex-m7 -mfpu=none
In addition to the above settings, I have also tried with either one or none of the following:
-mfloat-abi=soft
-mfloat-abi=softfp
-mfloat-abi=hard
For comparison, the GCC compiler version used was 10.3.1 (arm-none-eabi), with exactly the same options minus -target arm-none-eabi.
Each of the tests was done with simple functions using floating point, so that code validity could be checked, and each also included a check using -dM to dump the predefined macros.
1. With -mfloat-abi=hard, the predefined macro __ARM_PCS_VFP is always defined to 1, even on devices without hardware floating point or when options like -mfpu=none are used. Although the compiler doesn't emit floating point instructions in this case, the macro __ARM_PCS_VFP indicates that we are using hardware FP registers for passing parameters even when they don't exist. As a side note, GCC will generate an error if you use -mfloat-abi=hard in these cases. I have not checked what effect this has on the created object files. Currently I resolve this using:
2. Not sure, it could be because GCC defaults to -mfloat-abi=soft and Clang defaults to -mfloat-abi=softfp, however on GCC exactly one of __SOFTFP__ and __ARM_FP is defined, whereas Clang will not define __SOFTFP__ unless the option -mfloat-abi=soft is used. GCC will also define __SOFTFP__ even if the option -mfloat-abi=softfp is used on a device without an FPU. While this can be resolved by checking if __ARM_FP is undefined instead of checking if __SOFTFP__ is defined, this may cause some compatibility issues with GCC. This also had other issues on versions of Clang prior to 12.0.0.
3. There is a difference in how GCC and Clang define __ARM_FP. ACLE specifies that bit 2 indicates a double precision FPU, however a Cortex M4, which only supports single precision arithmetic, has bit 2 set on Clang while it is clear on GCC. The same is true for a Cortex M7 when a single precision FPU is specified instead of the double precision one. The generated code from the compiler appears correct; just the value defined in __ARM_FP appears different. The single precision FPUs do have "double" registers, but each is only 2 single precision registers, so the only valid instructions are moves of double registers, which allow moving 2 single precision registers with one instruction. I do not think this constitutes a double capable FPU. This makes it hard to conditionally compile on whether doubles are supported by the FPU, which can be a real problem for assembly routines. Not sure if the Clang or the GCC method of reporting is correct according to ACLE, however the GCC way does seem more logical and useful.

I did find other issues, but they seem to have been resolved from version 12.0.0 onwards, so I will not mention them here; those can be handled by checking the Clang version, or at least by reporting a conflict.
I have been able to work around most of it except for item 3 when there are different FPU variations for the core, such as the Cortex M7. Since Clang 12.0.0 this would probably only be a real issue when dealing with assembly language, as the compiler does appear to generate the correct code for C/C++.
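For readers who hit the same __ARM_PCS_VFP and __SOFTFP__ differences, the following is one possible shape for a guard header. It is a sketch only, not the workaround used above (which is not shown in this thread), and it relies on the observation reported here that __ARM_FP is left undefined when the target has no FPU. RTOS_SOFT_FLOAT and rtos_soft_float are hypothetical names.

```cpp
// Hypothetical project header; a sketch under the assumptions stated above.

// Reject a hard-float calling convention on a target that reports no
// hardware floating point, mirroring the error GCC is said to produce.
#if defined(__ARM_PCS_VFP) && !defined(__ARM_FP)
#error "-mfloat-abi=hard selected, but __ARM_FP is undefined (no hardware FP)"
#endif

// Treat the build as soft-float if either compiler's signal says so:
// __SOFTFP__ being defined, or __ARM_FP being undefined.
#if defined(__SOFTFP__) || !defined(__ARM_FP)
#define RTOS_SOFT_FLOAT 1
#else
#define RTOS_SOFT_FLOAT 0
#endif

// Typed view of the macro for use in ordinary C++ code.
constexpr bool rtos_soft_float() { return RTOS_SOFT_FLOAT != 0; }
```

Testing both macros with a logical OR is a deliberate choice: it avoids depending on which of the two a particular compiler version defines, at the cost of classifying a build as soft-float whenever either signal is present.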
Regards,
itaig1
Edits: Fixed wording and some formatting