rust-lang / rust


Can't use `vmsr` instruction in `global_asm!` on `armv7r-none-eabihf` without `codegen-units=1` #127269

Open jonathanpallant opened 4 months ago

jonathanpallant commented 4 months ago

See https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/armv7r-unknown-none-eabihf.20weirdness/near/448803070 for discussion and https://github.com/ferrous-systems/armv7r-issues for a reproducer.

I tried this code:

core::arch::global_asm!(
    r#"

.section .text.startup
.global _start
.code 32
.align 0

_start:
    // Set stack pointer
    ldr r3, =stack_top
    mov sp, r3
    // Allow VFP coprocessor access
    mrc p15, 0, r0, c1, c0, 2
    orr r0, r0, #0xF00000
    mcr p15, 0, r0, c1, c0, 2
    // Enable VFP
    mov r0, #0x40000000
    vmsr fpexc, r0
    // Jump to application
    bl kmain
    // In case the application returns, loop forever
    b .

"#
);

In the debug profile, this compiles OK. If you use the release profile and force codegen-units=1, it compiles. On armv8r-none-eabihf, it compiles.

But if the target is armv7r-none-eabihf and codegen-units != 1, you get this error:

error: <inline asm>:18:5: instruction requires: VFP2
    vmsr fpexc, r0
    ^

Meta

rustc --version --verbose:

rustc 1.78.0 (9b00956e5 2024-04-29)
binary: rustc
commit-hash: 9b00956e56009bab2aa15d7bff10916599e3d6d6
commit-date: 2024-04-29
host: aarch64-apple-darwin
release: 1.78.0
LLVM version: 18.1.2

or

rustc 1.81.0-nightly (6b0f4b5ec 2024-06-24)
binary: rustc
commit-hash: 6b0f4b5ec3aa707ecaa78230722117324a4ce23c
commit-date: 2024-06-24
host: aarch64-apple-darwin
release: 1.81.0-nightly
LLVM version: 18.1.7

Both have the same issue.

chrisnc commented 4 months ago

The same issue happens with riscv32imac-unknown-none-elf, when trying to use the "A" extension in global_asm!, so this does not seem to be an issue with a specific target, but rather how rustc handles target features for global_asm!. Adding .option arch, rv32imac makes the error go away.

$ cargo build --release
   Compiling qemu-armv7r v0.1.0 (/Users/chrisnc/src/armv7r-issues)
error: <inline asm>:7:5: instruction requires the following: 'A' (Atomic Instructions)
    lr.w t0, 0(t1)
    ^
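
A minimal sketch of the RISC-V workaround described above, assuming the `.option arch, rv32imac` directive goes at the top of the assembly text. The macro name `riscv_atomic_asm`, the `load_reserved` symbol, and the `RISCV_ATOMIC_ASM` constant are hypothetical and exist only for illustration; the real reproducer's assembly is not shown in this thread apart from the `lr.w` line.

```rust
// Hypothetical helper macro: expands to the assembly text as a string literal,
// so the same text can be passed to `global_asm!` and inspected as a constant.
macro_rules! riscv_atomic_asm {
    () => {
        r#"
.option arch, rv32imac
.global load_reserved
load_reserved:
    lr.w t0, 0(t1)
    ret
"#
    };
}

// Only assemble this when actually building for a 32-bit RISC-V target.
#[cfg(target_arch = "riscv32")]
core::arch::global_asm!(riscv_atomic_asm!());

// The same text as a constant, handy for checking that the directive is present.
pub const RISCV_ATOMIC_ASM: &str = riscv_atomic_asm!();
```

The directive states the architecture inside the assembly itself, so the block does not rely on rustc passing the "A" extension through to wherever the assembly gets parsed.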

Dirbaio commented 4 months ago

minimized:

#![no_std]

core::arch::global_asm!(
    r#"
.section .text.startup
.global _start
.code 32
.align 0

_start:
    vmsr fpexc, r0
"#
);

works: 'rustc --edition=2021 --crate-type lib --target armv7r-none-eabihf repro.rs -C opt-level=0 -C embed-bitcode=no'
fails: 'rustc --edition=2021 --crate-type lib --target armv7r-none-eabihf repro.rs -C opt-level=0'
fails: 'rustc --edition=2021 --crate-type lib --target armv7r-none-eabihf repro.rs -C opt-level=s -C embed-bitcode=no'

so both opt-level and embed-bitcode=no affect it. huh

jamesmunns commented 4 months ago

I have a half-baked theory (read: totally uninformed goose chase) that LLVM might not be properly copying the target features when creating the TargetMachine for codegen.

Following this down:

The last one says:

// SAFETY: llvm::LLVMRustCreateTargetMachine copies pointed to data

But:

https://github.com/rust-lang/rust/blob/c872a1418a4be3ea84a8d5232238b60d35339ba9/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp#L405-L530

doesn't do the copying. I'm trying to hunt down where in LLVM this copy would actually take place, in the createTargetMachine code.

jamesmunns commented 4 months ago

@Dirbaio tried leaking the feature flags, so it's probably not the "llvm doesn't copy the data right" thing I was guessing. Leaving the breadcrumbs in case it's useful for anyone following the codegen process down.

thejpster commented 4 months ago

This is possibly a dupe of https://github.com/rust-lang/rust/issues/80608

Dirbaio commented 4 months ago

i've narrowed it to this line. If that runs, compilation fails.

https://github.com/rust-lang/rust/blob/7ba82d61eb519c9c8cb8c47a3030a2bd2faaa186/compiler/rustc_codegen_llvm/src/back/write.rs#L720

so it's LTO-related, yep. Seems similar to https://github.com/rust-lang/rust/issues/80608 though the compilation does abort here. Probably root cause is https://github.com/llvm/llvm-project/issues/61991 too.

Dirbaio commented 4 months ago

narrowed it down to https://github.com/rust-lang/llvm-project/blob/96aca7c51701f9b3c5dd8567fcddf29492008e6d/llvm/lib/Object/ModuleSymbolTable.cpp#L96

the target features string there is empty. If I hardcode it to "+vfp3d16" the error goes away, that confirms the issue is there.
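
To illustrate why an empty feature string produces this error, here is a toy model in Rust. It is not LLVM's code: `check_asm`, the feature name handling, and the error text are simplified assumptions that only mimic the behaviour described above (an assembler that rejects `vmsr` unless a VFP feature is enabled, either through the feature string or through a directive in the assembly text).

```rust
use std::collections::HashSet;

/// Toy stand-in for an assembler pass over module-level assembly.
/// `feature_string` uses the "+feat,+feat" form, e.g. "+vfp3d16".
pub fn check_asm(feature_string: &str, asm: &str) -> Result<(), String> {
    // Parse the feature string into a set; an empty string yields an empty set.
    let mut enabled: HashSet<String> = feature_string
        .split(',')
        .filter_map(|f| f.trim().strip_prefix('+'))
        .map(str::to_owned)
        .collect();

    for (idx, raw) in asm.lines().enumerate() {
        let line = raw.trim();
        if let Some(fpu) = line.strip_prefix(".fpu ") {
            // A directive inside the assembly enables the feature locally.
            // Simplification: only the one FPU name from this thread is known.
            if fpu.trim() == "vfpv3-d16" {
                enabled.insert("vfp3d16".to_owned());
            }
        } else if line.starts_with("vmsr") && !enabled.contains("vfp3d16") {
            // Simplification: treat "vfp3d16" as the feature that unlocks `vmsr`.
            return Err(format!("line {}: instruction requires: VFP2", idx + 1));
        }
    }
    Ok(())
}
```

In this model, `check_asm("", "vmsr fpexc, r0")` fails, while both `check_asm("+vfp3d16", "vmsr fpexc, r0")` and `check_asm("", ".fpu vfpv3-d16\nvmsr fpexc, r0")` succeed, which mirrors the hardcoding experiment and the directive-based workaround.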

chrisnc commented 4 months ago

As in https://github.com/rust-lang/rust/issues/80608#issuecomment-1094267279, the workaround is to add assembly directives in the global_asm! block to enable target features. In this case it would be .fpu vfpv3-d16, which armv7r-none-eabihf enables by default.
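
Applied to the minimized reproducer from earlier in this thread, the workaround might look like the sketch below. The macro name `startup_asm` and the `STARTUP_ASM` constant are hypothetical helpers added for illustration, and the placement of `.fpu` before `_start` is an assumption; the directive only needs to appear before the first VFP instruction.

```rust
// Hypothetical helper macro: expands to the assembly text as a string literal.
macro_rules! startup_asm {
    () => {
        r#"
.section .text.startup
.global _start
.code 32
.align 0
.fpu vfpv3-d16

_start:
    vmsr fpexc, r0
"#
    };
}

// Only assemble this on bare-metal Arm builds (e.g. armv7r-none-eabihf).
// A real firmware crate would also carry `#![no_std]` at the crate root.
#[cfg(all(target_arch = "arm", target_os = "none"))]
core::arch::global_asm!(startup_asm!());

// The same text as a constant, so the directive's presence can be checked.
pub const STARTUP_ASM: &str = startup_asm!();
```

Because the FPU is declared in the assembly text, the block should assemble the same way regardless of which feature string reaches the assembler.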