Open rikyborg opened 1 month ago
As mentioned on the thread, this seems to come from LLVM. Would you mind filing an issue there? Here is an `llc` reproduction that shows `cortex-r5` is the cause: https://llvm.godbolt.org/z/zzK7KGv95
Whoops, didn't mean to remove prioritize but I think the labels raced.
cc @chrisnc as the maintainer for this target.
We updated the float support in https://github.com/rust-lang/rust/pull/123159 to be more-correct for this target but it looks like we didn't quite stick the landing I guess?
Processors in this family include the Arm Cortex-R4, 5, 7, and 8.
We most definitely intentionally included the R5 here, though apparently we assume an R5F.
LLVM's policy for Arm `target-cpu` is that it will enable the maximal set of features that the chosen CPU might support, and expects users to disable ones that they don't have/want, so it is correct and expected that the compiler assumes `vfp3d16` when you specify `cortex-r5`, even when using the non-`hf` target. The term "R5F" is just a shorthand for "a Cortex-R5 with FPU support", and for LLVM this is the intended default. Arm doesn't define it as a separate CPU, nor does LLVM. (Edit: looks like R4(F) is an exception to this in LLVM.)
Note also that the examples still take the `f32` arguments in the integer registers, which doesn't happen on the `hf` target, so overall the code generated is correct. The only thing I don't understand is why `target-feature=-vfp3d16` did not work as expected, so I will investigate that ~~, and see if this is new as of #123159 or was always this way~~. Edit: no, this can't be affected by that change, because we only modified the `hf` targets to use slightly different default features so they would compose better with `target-cpu`, but this case is about the non-`hf` targets. Still checking on the behavior there...
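As a rough illustration of what "arguments in the integer registers" means, the sketch below models the soft-float calling convention in plain Rust: the `f32` values cross the function boundary as raw 32-bit patterns, and only the arithmetic in the middle uses floating point. The function name `add_s_soft_abi` is hypothetical, and the register names in the comments are an assumption based on the assembly quoted in the issue, not something this code controls.

```rust
/// Models `add_s` as seen through the soft-float ABI: both arguments and
/// the result are raw 32-bit patterns, as they would be in integer registers.
/// (Hypothetical helper for illustration only.)
pub fn add_s_soft_abi(a_bits: u32, b_bits: u32) -> u32 {
    // Reinterpret the integer bit patterns as floats
    // (comparable to moving r0/r1 into FPU registers with `vmov`).
    let a = f32::from_bits(a_bits);
    let b = f32::from_bits(b_bits);
    // The addition itself may use the FPU (`vadd.f32`) or a library call
    // (`__aeabi_fadd`); the ABI is the same either way.
    let sum = a + b;
    // Hand the result back as an integer bit pattern.
    sum.to_bits()
}
```

The point of the sketch is that the ABI (how values are passed) and the instructions used for the computation are separate choices, which is why VFP instructions can legitimately appear on the non-`hf` target.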
`--target armv7r-none-eabi -C opt-level=3 -C target-cpu=cortex-r5 -C target-feature=-vfp2sp,-fp64`
https://godbolt.org/z/cj38s8Ev6
This will also disable the floating-point support. I think the issue is that the implied `+vfp3d16` from `cortex-r5` causes multiple other features to be enabled, but `-vfp3d16` doesn't disable those dependencies. Just using `+soft-float` in your project is still probably the right approach here, rather than referring to the VFP feature tree internal to LLVM.
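The behaviour described above can be sketched with a toy model. This is not LLVM's implementation: the type `Features`, the function `implied_by`, and the single edge from `vfp3d16` to `vfp2sp` and `fp64` are assumptions made for illustration, using only feature names that appear in this thread. The model shows why removing the top-level feature leaves the features it pulled in still enabled.

```rust
use std::collections::BTreeSet;

/// Hypothetical implication table: which features a given feature pulls in.
/// The single edge here is an illustrative assumption, not LLVM's real table.
fn implied_by(feature: &str) -> &'static [&'static str] {
    match feature {
        "vfp3d16" => &["vfp2sp", "fp64"],
        _ => &[],
    }
}

/// Toy feature set: enabling is transitive, disabling is not.
#[derive(Default)]
pub struct Features {
    enabled: BTreeSet<&'static str>,
}

impl Features {
    /// Enable a feature and, recursively, everything it implies.
    pub fn enable(&mut self, feature: &'static str) {
        if self.enabled.insert(feature) {
            for &dep in implied_by(feature) {
                self.enable(dep);
            }
        }
    }

    /// Disable only the named feature; implied features stay enabled.
    pub fn disable(&mut self, feature: &str) {
        self.enabled.remove(feature);
    }

    pub fn is_enabled(&self, feature: &str) -> bool {
        self.enabled.contains(feature)
    }

    /// In this model every feature is an FPU feature, so any remaining
    /// entry means floating-point instructions may still be selected.
    pub fn any_fp_enabled(&self) -> bool {
        !self.enabled.is_empty()
    }
}
```

In this model, enabling `vfp3d16` and then disabling it still leaves `vfp2sp` and `fp64` set, which mirrors the observation that `-vfp3d16` alone had no visible effect while `-vfp2sp,-fp64` did.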
Thanks a lot @chrisnc for investigating this issue!
> Just using `+soft-float` in your project is still probably the right approach here, rather than referring to the VFP feature tree internal to LLVM.
Yes, right now that seems the clearest approach. There's the downside that `rustc` will keep throwing the warning ``unknown and unstable feature specified for `-Ctarget-feature`: `soft-float` ``. Is there any way to suppress that?
Or should the non-`hf` target already include the `+soft-float` feature? If the target does not have hard-float support, is there any reason to generate code using floating-point instructions? (I might be missing something here...)
> As mentioned on the thread, this seems to come from LLVM. Would you mind filing an issue there?
Should I still go ahead and file an issue with LLVM as suggested initially by @tgross35?
One source of confusion for me is that LLVM seems to distinguish between Cortex-R4 and Cortex-R4F, but not between Cortex-R5 and Cortex-R5F.
WG-prioritization assigning priority (Zulip discussion).
@rustbot label -I-prioritize +P-medium
> Yes, right now that seems the clearest approach. There's the downside that rustc will keep throwing the warning ``unknown and unstable feature specified for `-Ctarget-feature`: `soft-float` ``. Is there any way to suppress that?
I don't think there's a way to disable this right now except to use nightly. I can't find an open issue for stabilizing these right now, but that is the path. This warning is relatively recent, and this issue gives some explanation of why it's there: https://github.com/rust-lang/rust/pull/117616.
> Or should the non-`hf` target already include the `+soft-float` feature? If the target does not have hard-float support, is there any reason to generate code using floating-point instructions? (I might be missing something here...)
The `hf`-ness is about the ABI, not what the target could support if the user asks for it (which was the case here). It's perfectly valid to use the soft-float ABI on a core that has an FPU, and the compiler can use the VFP instructions for things other than computation on `f32`/`f64` (or you might legitimately need to do floating-point computation while calling/being called by code that uses the soft-float ABI because you can't re-compile it). This is consistent with how clang/llvm handle this; the expectation is that you use a combination of `target-cpu` + `target-feature` to specify what you want, rather than having the ABI choice forcibly disable some features.
> > As mentioned on the thread, this seems to come from LLVM. Would you mind filing an issue there?
>
> Should I still go ahead and file an issue with LLVM as suggested initially by @tgross35?
>
> One source of confusion for me is that LLVM seems to distinguish between Cortex-R4 and Cortex-R4F, but not between Cortex-R5 and Cortex-R5F.
I think that would be worthwhile, at least to understand why there is a separate `cortex-r4f` but not for the others. I will speculate that this is for historical reasons and that they would remove it if not for backward compatibility, because it is inconsistent. Anecdotally, I think nofp cortex-r4 are much more common than nofp cortex-r5, which might also explain the exception, but I don't know the provenance of the one you are using. When you are using `clang`, you would use `-mcpu=cortex-r5+nofp` for your case, which obviates the need for a separate CPU name or explicitly disabling exactly the FPU features that are included with `cortex-r5`.
> Yes, right now that seems the clearest approach. There's the downside that rustc will keep throwing the warning ``unknown and unstable feature specified for `-Ctarget-feature`: `soft-float` ``. Is there any way to suppress that?
We should probably just add this feature.
@chrisnc For note, LLVM seems to have flipflopped on the maximal vs. minimal thing for Arm CPUs at least twice now, and recent Arm CPUs (aarch64, mostly) are more likely to use a minimal featureset, though that may only be true from Armv9 onwards.
> > Yes, right now that seems the clearest approach. There's the downside that rustc will keep throwing the warning ``unknown and unstable feature specified for `-Ctarget-feature`: `soft-float` ``. Is there any way to suppress that?
>
> We should probably just add this feature.
Whoops, I missed that this was also unknown, not just unstable, when I wrote my comment. I'm not sure this will be possible though, because it affects the ABI, and the reasoning here applies to Arm also.
Should `rustc` consider something like `clang`'s `-mcpu`, which allows things like `cortex-r5+nofp`? I'm not sure if this is exposed in a re-usable way by llvm, but it could go a long way in solving these types of problems.
> For note, LLVM seems to have flipflopped on the maximal vs. minimal thing for Arm CPUs at least twice now, and recent Arm CPUs (aarch64, mostly) are more likely to use a minimal featureset, though that may only be true from Armv9 onwards.
Interesting... At least for the v7m, v7r, v8m, and v8r cores, the default feature sets have all been maximal when each core was added and then not changed except for refactoring and bug fixes from what I can see, but A-profile and Aarch64 are quite a bit more varied and may have different considerations. It seems that R4 as distinct from R4F was done originally in GCC in 2008 and then LLVM followed it, but no other exceptions exist, even though "M4F" and "R5F" and others are commonly mentioned when a non-trivial population of them are `nofp`.
When targeting the Arm Cortex-R5 processor, rustc and llvm generate assembly containing floating-point instructions. These are not available on the Cortex-R5 (only on the Cortex-R5F) and cause the processor to halt.
Details
For example, the following code:
compiled with flags
--target armv7r-none-eabi -C opt-level=3 -C target-cpu=cortex-r5
generates the assembly (compiler explorer link):

Note the `vadd.f32` and `vadd.f64` instructions, that are available on the Cortex-R5F which has an FPU, but that are illegal instructions on the Cortex-R5 without an FPU.

My expectation is that, using the target `armv7r-none-eabi` (rather than `armv7r-none-eabihf`) and the target CPU `cortex-r5` (rather than `cortex-r5f`), rustc and llvm would generate legal instructions for the processor, i.e. use software floating-point features rather than hard-float instructions.

Relevant information
Original thread on URLO: Unexpected codegen for Cortex-R5 without FPU.
The upstream LLVM CPU model for cortex-r5 includes the flag FeatureVFP3_D16, which seems OK for the R5F but wrong for the R5. However, manually disabling that feature doesn't seem to help (see below). Link to llvm repo at tag 18.1.7.
Information from the ARM Cortex-R Series Programmer's Guide: 6.1.6. VFP in the Cortex-R processors.
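The snippet referred to in the Details section did not survive in this copy of the issue. A minimal reconstruction, assuming only what the symbol names `add_s` and `add_d` and the `vadd.f32`/`vadd.f64` instructions in the quoted assembly suggest, might look like the following. The exact signatures and attributes of the original are not known.

```rust
// Hypothetical reconstruction of the test functions; not the original code.
// On Compiler Explorer the functions would typically also be marked
// `#[no_mangle]` (or `#[unsafe(no_mangle)]` on newer editions) so the
// symbols appear as `add_s` and `add_d` in the assembly output.

/// Single-precision add: expected to lower to `vadd.f32` with an FPU,
/// or to a call to `__aeabi_fadd` with software floating point.
pub fn add_s(a: f32, b: f32) -> f32 {
    a + b
}

/// Double-precision add: expected to lower to `vadd.f64` with an FPU,
/// or to a call to `__aeabi_dadd` with software floating point.
pub fn add_d(a: f64, b: f64) -> f64 {
    a + b
}
```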
Things that do work
Skipping the target-cpu flag

Compiling the snippet above with flags `--target armv7r-none-eabi -C opt-level=3` generates:

```asm
add_s:
        push    {r11, lr}
        bl      __aeabi_fadd
        pop     {r11, pc}
add_d:
        push    {r11, lr}
        bl      __aeabi_dadd
        pop     {r11, pc}
```

Adding the soft-float target feature

Compiling the snippet above with flags `--target armv7r-none-eabi -C opt-level=3 -C target-cpu=cortex-r5 -C target-feature=+soft-float` generates:

```asm
add_s:
        push    {r11, lr}
        bl      __aeabi_fadd
        pop     {r11, pc}
add_d:
        push    {r11, lr}
        bl      __aeabi_dadd
        pop     {r11, pc}
```

However, rustc generates the warning:

```
warning: unknown and unstable feature specified for `-Ctarget-feature`: `soft-float`
  |
  = note: it is still passed through to the codegen backend, but use of this feature might be unsound and the behavior of this feature can change in the future
  = help: consider filing a feature request

warning: 1 warning emitted
```

Things that don't work

Removing the vfp3d16 target feature

Compiling the snippet above with flags `--target armv7r-none-eabi -C opt-level=3 -C target-cpu=cortex-r5 -C target-feature=-vfp3d16` generates:

```asm
add_s:
        vmov    s0, r1
        vmov    s2, r0
        vadd.f32        s0, s2, s0
        vmov    r0, s0
        bx      lr
add_d:
        vmov    d0, r2, r3
        vmov    d1, r0, r1
        vadd.f64        d0, d1, d0
        vmov    r0, r1, d0
        bx      lr
```

Meta
I've tested both stable and nightly toolchains, with same results.
`rustc --version --verbose`: