llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Issues with ARM hardware floating point support #55755

Open itaig1 opened 2 years ago

itaig1 commented 2 years ago

First, a bit of background: I'm developing an RTOS for the ARM Cortex-M series cores and will also be working on the Cortex-R cores at a later date. To cut a long story short, I came across several issues with floating-point support that appear to differ from GCC and the ARM ABI standards, so I looked into this more deeply and came up with the following:

This could be considered several bugs, but they are all connected. I have checked these issues on versions 11.0.0, 12.0.0, 13.0.0, 14.0.0 and 15.0.0 with the options -target arm-none-eabi -O3 -std=gnu++20 -x c++, and also tested with the following options:

For CPU:
  -mcpu=cortex-m3
  -mcpu=cortex-m4
  -mcpu=cortex-m4+nofp
  -mcpu=cortex-m4 -mfpu=none
  -mcpu=cortex-m7
  -mcpu=cortex-m7+nofp
  -mcpu=cortex-m7+nofp.dp
  -mcpu=cortex-m7 -mfpu=none

In addition to the above settings, I also tried with either one or none of the following: -mfloat-abi=soft -mfloat-abi=softfp -mfloat-abi=hard

For comparison, the GCC version used was 10.3.1 (arm-none-eabi) with exactly the same options minus -target arm-none-eabi.

Each test used simple floating-point functions so that the generated code could be checked, together with -dM to dump the predefined macros.

  1. When compiling with -mfloat-abi=hard, the predefined macro __ARM_PCS_VFP is always defined as 1, even on devices without hardware floating point or when options such as -mfpu=none are used. Although the compiler doesn't emit floating-point instructions in this case, __ARM_PCS_VFP indicates that hardware FP registers are used for passing parameters even though they don't exist. As a side note, GCC generates an error if you use -mfloat-abi=hard in these cases. I have not checked what effect this has on the generated object files. Currently I work around this using:
#ifndef __ARM_FP
#ifdef __ARM_PCS_VFP
#undef __ARM_PCS_VFP
#warning Unexpected __ARM_PCS_VFP is defined, undefining __ARM_PCS_VFP
#endif
#endif
  2. This may be because GCC defaults to -mfloat-abi=soft while Clang defaults to -mfloat-abi=softfp, but on GCC exactly one of __SOFTFP__ and __ARM_FP is defined, whereas Clang does not define __SOFTFP__ unless -mfloat-abi=soft is used. GCC also defines __SOFTFP__ when -mfloat-abi=softfp is used on a device without an FPU. This can be worked around by checking whether __ARM_FP is undefined instead of whether __SOFTFP__ is defined, but it may cause compatibility issues with code written for GCC. There were further related issues in versions of Clang prior to 12.0.0.

  3. GCC and Clang define __ARM_FP differently. ACLE specifies that bit 2 indicates a double-precision FPU; however, on a Cortex-M4, which only supports single-precision arithmetic, bit 2 is set with Clang and clear with GCC. The same is true for a Cortex-M7 when a single-precision FPU is specified instead of the double-precision one. The generated code appears correct; only the value of __ARM_FP differs. The single-precision FPUs do have "double" registers, but each is just a pair of single-precision registers, so the only valid instructions on them are moves, which transfer two single-precision registers with one instruction. I do not think this constitutes a double-capable FPU. This makes it hard to conditionally compile on whether the FPU supports doubles, which can be a real problem for assembly routines. I'm not sure whether the Clang or the GCC way of reporting is correct according to ACLE, but the GCC way seems more logical and useful.
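
The workarounds for items 1 and 2 can be combined into a single header-style check. The following is only a minimal sketch: the RTOS_HW_FP and RTOS_HARD_FLOAT_ABI macros and the two helper functions are hypothetical project-local names, not anything defined by Clang, GCC or ACLE. It keys everything on __ARM_FP, as suggested above, and only trusts __ARM_PCS_VFP when __ARM_FP is also defined.

```cpp
// Hypothetical project-local macros (not defined by any compiler).
// RTOS_HW_FP: keyed on __ARM_FP rather than __SOFTFP__, because the
// thread reports that __SOFTFP__ is defined differently by GCC and Clang.
#if defined(__ARM_FP)
#define RTOS_HW_FP 1
#else
#define RTOS_HW_FP 0
#endif

// RTOS_HARD_FLOAT_ABI: only trust __ARM_PCS_VFP when FP hardware is
// actually reported, since Clang may define it without an FPU.
#if defined(__ARM_PCS_VFP) && RTOS_HW_FP
#define RTOS_HARD_FLOAT_ABI 1
#else
#define RTOS_HARD_FLOAT_ABI 0
#endif

// Hypothetical helpers so the result can be used in ordinary C++ code.
constexpr bool rtos_hw_fp() { return RTOS_HW_FP != 0; }
constexpr bool rtos_hard_float_abi() { return RTOS_HARD_FLOAT_ABI != 0; }

// By construction, the hard-float ABI is never reported without FP hardware.
static_assert(rtos_hw_fp() || !rtos_hard_float_abi(),
              "hard-float ABI reported without FP hardware");
```

Unlike the #undef approach, this leaves the compiler's own predefined macros untouched and derives separate names from them, which may be easier to audit.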

I did find other issues, but they appear to have been resolved from version 12.0.0 onwards, so I will not mention them here; they can be handled by checking the Clang version, or at least by reporting a conflict.

I have been able to work around most of this except for item 3 when a core has different FPU variants, such as the Cortex-M7. Since Clang 12.0.0 this is probably only a real issue when dealing with assembly language, as the compiler does appear to generate correct code for C/C++.

Regards,

itaig1

Edits: Fixed wording and some formatting

llvmbot commented 2 years ago

@llvm/issue-subscribers-backend-arm

itaig1 commented 2 years ago

I have done some further testing using Clang 14.0.0, showing that -mfloat-abi=hard does carry through to the generated object file. The main issue is the presence of Tag_ABI_VFP_args: VFP registers in the hard ABI mode; this should not even be possible on a device which doesn't support hardware floating point.

The tests I performed are outlined below:

Source of test1.cpp:

double x;

double foo(double y)
{
    x += y;
    return x;
}

Test1:

Commands:

clang -target arm-none-eabi -O3 -Wall -std=gnu++20 -c -mcpu=cortex-m3 -x c++ -mfloat-abi=hard test1.cpp -o test1hard.o
readelf -A test1hard.o

Output from readelf:

Attribute Section: aeabi
File Attributes
  Tag_conformance: "2.09"
  Tag_CPU_name: "cortex-m3"
  Tag_CPU_arch: v7
  Tag_CPU_arch_profile: Microcontroller
  Tag_ARM_ISA_use: No
  Tag_THUMB_ISA_use: Thumb-2
  Tag_ABI_PCS_R9_use: V6
  Tag_ABI_PCS_GOT_use: direct
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Unused
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: int
  Tag_ABI_VFP_args: VFP registers
  Tag_ABI_optimization_goals: Aggressive Speed
  Tag_CPU_unaligned_access: None
  Tag_ABI_FP_16bit_format: IEEE 754

Test2:

Commands:

clang -target arm-none-eabi -O3 -Wall -std=gnu++20 -c -mcpu=cortex-m3 -x c++ -mfloat-abi=soft test1.cpp -o test1soft.o
readelf -A test1soft.o

Output from readelf:

Attribute Section: aeabi
File Attributes
  Tag_conformance: "2.09"
  Tag_CPU_name: "cortex-m3"
  Tag_CPU_arch: v7
  Tag_CPU_arch_profile: Microcontroller
  Tag_ARM_ISA_use: No
  Tag_THUMB_ISA_use: Thumb-2
  Tag_ABI_PCS_R9_use: V6
  Tag_ABI_PCS_GOT_use: direct
  Tag_ABI_PCS_wchar_t: 4
  Tag_ABI_FP_denormal: Needed
  Tag_ABI_FP_exceptions: Unused
  Tag_ABI_FP_number_model: IEEE 754
  Tag_ABI_align_needed: 8-byte
  Tag_ABI_align_preserved: 8-byte, except leaf SP
  Tag_ABI_enum_size: int
  Tag_ABI_optimization_goals: Aggressive Speed
  Tag_CPU_unaligned_access: None
  Tag_ABI_FP_16bit_format: IEEE 754
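
The only difference between the two dumps is the Tag_ABI_VFP_args line. If this needs to be checked as part of a build, a minimal sketch could scan captured `readelf -A` text for that line. The function name declares_vfp_args is hypothetical, and the check assumes the attribute is printed exactly as in the Test1 output above.

```cpp
#include <string_view>

// Hypothetical helper: returns true if the given `readelf -A` text
// contains the attribute line seen in the -mfloat-abi=hard object above.
// Assumes the exact spelling "Tag_ABI_VFP_args: VFP registers".
constexpr bool declares_vfp_args(std::string_view readelf_attrs) {
    return readelf_attrs.find("Tag_ABI_VFP_args: VFP registers")
           != std::string_view::npos;
}
```

For a Cortex-M3 object such as test1hard.o, a true result would flag the mismatch described in this comment.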

Hope this helps a bit more.

Regards,

itaig1

itaig1 commented 2 years ago

I made a mistake with item 3: it is bit 3, not bit 2. Even so, the value is not always correct. The two cases I have found that are incorrect are the following:

  -mcpu=cortex-m7+nofp.dp: __ARM_FP is defined as 0xE
  -mcpu=cortex-m55+nofp.dp: __ARM_FP is defined as 0xE

In both cases double precision is shown as present. The generated code correctly uses a library call for double precision; only the value of __ARM_FP is wrong.
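
To make the bit layout concrete, here is a small sketch of decoding an __ARM_FP value. It assumes the ACLE layout in which bit 1 is half precision, bit 2 is single precision and bit 3 is double precision; the constant and function names are hypothetical. Under that layout 0xE (binary 1110) claims half, single and double precision, whereas a single-precision-only FPU would be expected to report 0x6 (binary 0110).

```cpp
#include <cstdint>

// Assumed ACLE bit layout for __ARM_FP (names are hypothetical).
constexpr std::uint32_t kArmFpHalf   = 1u << 1;  // 16-bit
constexpr std::uint32_t kArmFpSingle = 1u << 2;  // 32-bit
constexpr std::uint32_t kArmFpDouble = 1u << 3;  // 64-bit

constexpr bool fp_has_single(std::uint32_t arm_fp) {
    return (arm_fp & kArmFpSingle) != 0;
}

constexpr bool fp_has_double(std::uint32_t arm_fp) {
    return (arm_fp & kArmFpDouble) != 0;
}

// Value for the current target; 0 when __ARM_FP is not defined.
#ifdef __ARM_FP
constexpr std::uint32_t kTargetArmFp = __ARM_FP;
#else
constexpr std::uint32_t kTargetArmFp = 0;
#endif

// 0xE is the value reported above for cortex-m7+nofp.dp.
static_assert(fp_has_double(0xE), "0xE sets the double-precision bit");
static_assert(fp_has_single(0x6) && !fp_has_double(0x6),
              "0x6 is single precision without double");
```

Code that must choose between hardware and library double-precision paths could test fp_has_double(kTargetArmFp), which is exactly the check that gives the wrong answer in the two cases listed.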

Regards,

itaig1

itaig1 commented 2 years ago

In addition to my last post: the command line option -mcpu=cortex-m7 -mfpu=fpv4-sp-d16 does work correctly, but -mcpu=cortex-m7+nofp.dp does not. Both generate correct code; only the value of __ARM_FP is wrong in the second case.

Regards

itaig1

john-brawn-arm commented 2 years ago

Looking at this, my thoughts are:

  1. We should give an error on -mfloat-abi=hard when the target doesn't have floating-point hardware, as GCC does.
  2. We should do as GCC does and define __SOFTFP__ for both -mfloat-abi=soft and -mfloat-abi=softfp.
  3. It looks like the internal target features for handling single-precision-only are a mess. ARM::appendArchExtFeatures in llvm/lib/Support/ARMTargetParser.cpp expects that double precision can be removed by adding the "-fp64" feature, but ARMTargetInfo::handleTargetFeatures in clang/lib/Basic/Targets/ARM.cpp expects that single-precision-only is indicated by the presence of the sp version of the vfp version and the absence of the dp version (e.g. "+vfp4d16sp" is present and "+vfp4d16" is absent). It looks like this disagreement is what's causing __ARM_FP to be set incorrectly.
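
The disagreement in point 3 can be illustrated with a deliberately simplified model. This is not the LLVM or Clang code: the Features alias and all three function names are hypothetical, and the feature list is a made-up example built from the feature strings quoted above. It only shows how the two conventions can reach opposite conclusions about the same list.

```cpp
#include <algorithm>
#include <string_view>
#include <vector>

// Hypothetical, simplified stand-in for a target feature list.
using Features = std::vector<std::string_view>;

bool has_feature(const Features& features, std::string_view name) {
    return std::find(features.begin(), features.end(), name) != features.end();
}

// Convention described for the target parser side:
// double precision is removed by adding "-fp64".
bool single_only_by_fp64(const Features& features) {
    return has_feature(features, "-fp64");
}

// Convention described for the Clang target info side:
// the sp variant is present and the dp variant is absent.
bool single_only_by_vfp_pair(const Features& features) {
    return has_feature(features, "+vfp4d16sp") &&
           !has_feature(features, "+vfp4d16");
}
```

With an illustrative list such as {"+vfp4d16sp", "+vfp4d16", "-fp64"}, the first check reports single-precision-only while the second does not, which matches the kind of mismatch that would leave the double-precision bit set in __ARM_FP.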

stuij commented 2 years ago

I created a review on Phabricator to handle __SOFTFP__ compatibility with GCC: https://reviews.llvm.org/D135680

@itaig1: feel free to add yourself as a reviewer. If you are registered on Phabricator, I failed in guessing your username there.

mplatings commented 1 year ago

Warn if -mfloat-abi=hard is specified when floating point registers are not supported: https://reviews.llvm.org/D150902