Floating point ABI selection inconsistent

kjbracey commented 5 years ago

Description of defect

Floating point ABI selection (hardfp/softfp) appears to be inconsistent between chips and toolchains. Some sort of alignment correction is required, but it's not clear in which direction.

Current state is, I think (can someone crosscheck?):

IAR setup does not specify ABI, so hardfp is used if the selected CPU has FP.
GCC setup specifies softfp always for all CPUs.
ARMC5 setup does not specify ABI, so hardfp is used if the selected CPU has FP.
ARMC6 setup specifies hardfp for M4F, softfp for M33F and M7F

ABI compatibility probably isn't a huge deal, as aside from the "multi-ABI" compiler runtime+C libraries, I believe we do not deal with precompiled binaries other than those for a specific target, so locked down to a particular CPU. (Are there any exceptions?)

If we were to attempt to share a binary across multiple chips, we would want to select softapi ABI always, like GCC, and ensure shared binaries didn't use FP instructions. But is that a requirement? Only existing case I'm aware of was Nanostack, which was carefully compiled for ARMv6 and ARMv7 without using FP, but that's now gone.

If not trying to have multi-CPU binaries, then the "use hardfp ABI if using FP instructions" default for IAR and ARMC5 seems reasonable.

It doesn't make sense to treat the toolchains differently, but any change in either direction would lead to problems with precompiled binaries. Such a change maybe should be reserved to whenever another incompatible change forces recompilation.

But at a minimum we do need to get ARMC5 and ARMC6 in line - that means using hardfp whenever we have an fpu, so changing M33F and M7F

Target(s) affected by this defect ?

n/a

Toolchain(s) (name and version) displaying this defect ?

n/a

What version of Mbed-os are you using (tag or sha) ?

n/a

What version(s) of tools are you using. List all that apply (E.g. mbed-cli)

n/a

How is this defect reproduced ?

n/a

cmonr commented 5 years ago

@ARMmbed/mbed-os-core

deepikabhavnani commented 5 years ago

CC @theotherjimmy

bulislaw commented 5 years ago

Do we know what are the size implication of using softfp vs hardfp? I agree with Kevin's suggestion to use hardfp whenever possible.

cmonr commented 5 years ago

@bulislaw I'd expect that code and ram sizes decrease since we're replacing software routines for hardware-accelerated opcodes.

bulislaw commented 5 years ago

Yeah, i'm just curious to see what the difference is (not enough to try it though :D ). It would be also interesting to see how often people use FP (i'm guessing mostly to get measurements from sensors, that require divisions or PIDs).

kjbracey commented 5 years ago

Most people instinctively avoid FP, but some parts of the code base have quite a bit (eg LoRAWAN's timing code). And as long as they're avoiding double (not in HW normally) and sticking to float that's fine.

Size difference is negligible afaict - can be either slightly plus or minus depending on app.

I'd expect that code and ram sizes decrease since we're replacing software routines for hardware-accelerated opcodes.

This is only about the calling convention between functions - we're already using hardware floating point inside functions when available.

There are 5 options for compilation 1) Don't use hardware FP instructions 2s) Use single FP instructions, do doubles in software, pass values in integer registers in ABI 2d) Use single and double FP instructions, pass values in integer registers in ABI 3s) Use single FP instructions, do doubles in software, pass values in FP registers in ABI 3d) Use single and double FP instructions, pass values in FP registers in ABI

For ARM C 5 and IAR, we currently use 1 or 3. For GCC we currently use 1 or 2. Depends on target CPU, including s or d variant.

The only real reason to use type 2 is for interworking precompiled binaries - type 1 and type 2 can interwork, so you can freely mix them. But you'd still need to take care anyway about not trying to use a type 2 library on a machine without hardware FP, unless you have a FP emulator to catch the instructions (as desktop systems like ARM Linux or RISC OS do). On those systems type 1 or type 2 is then just a choice of "optimise for non-FP hardware" or "optimise for FP hardware".

But we're not currently set up to do that - I don't believe anyone is attempting to provide target-independent type-1 binaries, and we certainly don't have an FP emulator. All our binaries are either target-specific, or multiple-ABI binary sets like the C library. (Nanostack binaries were type-1, but had no FP in its external API anyway, and maybe none internally).

Using a type-1 binary when you actually have hardware FP would actually cause code size bloat anyway, as it would pull in software FP routines. (Not sure if the C libraries have "implement software FP using hardware FP instructions" versions to help that case). So shouldn't be encouraged. In our "basically all one application" model, it makes sense to use type 3.

Type 2s might also be a bit faster than 3s if the code is using doubles - the compiler has to do the double arithmetic in software using integers, then transfer the value into an FP register for a function call. But that extra cost is minimal - if you were worried about performance you'd be using floats.

kjbracey commented 5 years ago

@theotherjimmy - Thanks for approving #9154. I understand that we prefer "hardfp ABI if using FP", but GCC remains using softfp just for backwards compatibility at this point. Is that right?

State after #9154:

IAR setup does not specify ABI, so hardfp is used if the selected CPU has FPU.
ARMC5 setup does not specify ABI, so hardfp is used if the selected CPU has FPU.
ARMC6 setup specifies hardfp whenever specifying FPU
GCC setup specifies softfp always for all CPUs.

I wonder if it would be possible to change GCC to match - how many precompiled GCC binaries for M4F/M7F targets are there that would need to be recompiled? Would it make sense to move that way piecemeal? Eg M33F hardfp now (because no old binaries), shift others as+when binaries checked? Or is the toolchain inconsistency too minor to worry about, so leave GCC softfp for all targets? Using softfp will only really affect FP-intensive code, after all.

0xc0170 commented 5 years ago

I wonder if it would be possible to change GCC to match - how many precompiled GCC binaries for M4F/M7F targets are there that would need to be recompiled? Would it make sense to move that way piecemeal? Eg M33F hardfp now (because no old binaries), shift others as+when binaries checked? Or is the toolchain inconsistency too minor to worry about, so leave GCC softfp for all targets? Using softfp will only really affect FP-intensive code, after all.

I would give it a shot . We can check how many libs are there for M4F/M7F - I can think of at least Odin that we could test?

ifyall commented 5 years ago

@kjbracey-arm, @0xc0170 and @maclobdell , It is much more preferable to make the softfp vs hardfp decision an application level choice. Depending on your user application you might have a strong need for one versus the other. For instance, a floting-point heavy application would likely want the hardfp choice for improved performance. Another application may be dependent on 3rd-party libraries that are not sourced based and don't have a hardfp variant, requiring them to use softfp.

This fundamentally needs to be a user choice. Ian

kjbracey commented 5 years ago

I largely agree, and there's nothing stopping you making local edits to the python to modify it.

The reason to not have such a setting is our own binary blob issue. To expose such a setting as Mbed OS API would mean one of the following:

1) Requiring all targets with binary blobs to provide softfp and hardfp ABI versions (if they have FP hardware). 2) Documenting that targets with binary blobs may not support manual switching of ABI mode. (The linker would presumably complain itself).

(And it would be another build variant to test - unlikely the non-default ABI would actually be tested by CI, given CI load).

ifyall commented 5 years ago

@kjbracey-arm, could we make softfp vs hardfp an option at the target level, much like features such as BLE are? That way not all vendors are required to provide blobs? This would also limit the test variants in your CI to only be done on targets that support it.

For a customer who needs to change the option today, can ARM provide instructions on what they should do? I am not certain it is obvious enough.

Thank you,

Ian

kjbracey commented 5 years ago

Target-level selection is a valid feature request - I would suggest pushing for this via vendor/partner channels if you want resource allocated from our tools team, but we'd welcome a patch.

My initial suggestion is a "target.float-abi" option, which you can set to "hard", "softfp" or "soft" matching ARMC6/GCC naming. Would apply to all toolchains - if you want to be specifying this, there's no point supporting different for different toolchains. Default value would be none, which would give the backwards compatible behaviour. Targets or applications could override. Python code would deal with mapping to IAR/ARMC5 equivalents.

Doing it for the command line build is probably fairly straightforward - making it also work in every exporter would be a bigger job.

For a customer who needs to change the option today, can ARM provide instructions on what they should do? I am not certain it is obvious enough.

For the command-line build you need to edit tools/toolchains/arm.py, or gcc.py or iar.py.

GCC.__init__ and ARMC6.__init__ explicitly add -mfloat-abi=softfp or -mfloat-abi=hardrespectively to to the common flags whenever compiling for a target with FP. You can just change that there.

IAR.__init__ doesn't set any options explicitly, so the compiler defaults to hard if an FPU is present. The option for IAR is --apcs=std or --apcs=vfp, which are equivalent to softfp and hard respectively. (soft would be achieved by --fpu=none. I hope that would override any FP indication in the CPU name like --cpu=Cortex-M4F.) You would add the --aapcs option to c_flags_cmd alongside --thumb. --fpu=none would go to both c_flags_cmd and asm_flags_cmd.

ARM.__init__ (for ARMC5) doesn't set any options explicitly, so the compiler defaults to hardfp if an FPU is present. The option for ARMC5 is --apcs=/softfp or --apcs=/hardfp. To get "mfloat-abi=soft" behaviour you would pass --fpu=softvfp. Those options should be added to self.flags['common'] so they get passed to both the compiler and assembler.

For exporters, the above may or may not work, depending on how the exporter/IDE works. You might need to edit other settings in XML files.

aqib-ublox commented 4 years ago

@MarceloSalazar still active ?

MarceloSalazar commented 4 years ago

Remains valid afaik. @kjbracey-arm please comment if you think otherwise.

kjbracey commented 4 years ago

There's been no change in the codebase since my last comment, afaik. (The original description isn't up-to-date - it's updated by my 3 Jan 2019 comment).

I'd say it's still "patches welcome", and there's no ARM resource allocated that I'm aware of.

oclyke commented 4 years ago

Just chiming in that this issue proved useful to me. I desired to use the -mfloat-abi=hard option rather than soft to compile a --library version of mbed under GCC. I accomplished this by following the instructions to modify the Python scripts though it would be very nice to be able to achieve this without modifying mbed source.

ciarmcom commented 4 years ago

@kjbracey-arm thank you for raising this issue.Please take a look at the following comments:

Could you add some more detail to the description? A good description should be at least 25 words. What target(s) are you using? What toolchain(s) are you using? What version of Mbed OS are you using (tag or sha)? It would help if you could also specify the versions of any tools you are using? How can we reproduce your issue?

NOTE: If there are fields which are not applicable then please just add 'n/a' or 'None'.This indicates to us that at least all the fields have been considered. Please update the issue header with the missing information, the issue will not be mirroredto our internal defect tracking system or investigated until this has been fully resolved.

adbridge commented 4 years ago

We've updated our automation, I will fix the requirements .

ciarmcom commented 4 years ago

Thank you for raising this detailed GitHub issue. I am now notifying our internal issue triagers. Internal Jira reference: https://jira.arm.com/browse/IOTOSM-2461

ciarmcom commented 3 years ago

We closed this issue because it has been inactive for quite some time and we believe it to be low priority. If you think that the priority should be higher, then please reopen with your justification for increasing the priority.

ARMmbed / mbed-os