rust-lang / rust

Inline assembly fails to compile after allowing inlining on the enclosing function for i686-pc-windows-msvc #106781

Open johnmave126 opened 1 year ago

johnmave126 commented 1 year ago

Consider the following code:

use std::arch::asm;

#[inline(never)]
pub fn my_test(a: u32, b: u32, c: u32, d: u32) -> u32 {
    let mut g = 3;
    unsafe {
        asm!(
            "mov {a}, {b}",
            "xor {a}, {c}",
            "and {a}, {d}",
            "or {a}, {e}",
            "not {f}",
            "add {a}, {f}",
            "sub {a}, {g}",
            "mov {g}, {a}",

            a = in(reg) a,
            b = in(reg) b,
            c = in(reg) c,
            d = in(reg) d,
            e = out(reg) _,
            f = out(reg) _,
            g = inout(reg) g
        )
    }
    g
}

This uses 7 registers. On i686-pc-windows-msvc there are in principle 7 general-purpose registers available, but the Rust Reference mentions that ebp and esi are reserved, so I expected a compilation error.
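The register arithmetic behind that expectation can be sketched as follows. This is only an illustration of the counting argument, not of how LLVM allocates registers. The names `X86_GPRS`, `RESERVED`, `unreserved_registers`, and `OPERANDS_IN_MY_TEST` are hypothetical helpers introduced here, and the reserved set is taken from the Rust Reference as cited above.

```rust
/// The eight 32-bit general-purpose registers on x86.
const X86_GPRS: [&str; 8] = ["eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"];

/// The stack pointer is never available as an asm operand.
const STACK_POINTER: &str = "esp";

/// Registers described as reserved on 32-bit x86 (frame pointer and base pointer).
/// Assumption: taken from the Rust Reference as cited in this issue.
const RESERVED: [&str; 2] = ["ebp", "esi"];

/// Number of distinct `reg` operands in `my_test`: a, b, c, d, e, f, g.
/// None of them is `lateout`, so no two operands may share a register.
pub const OPERANDS_IN_MY_TEST: usize = 7;

/// Registers that remain once the stack pointer and the reserved registers are excluded.
pub fn unreserved_registers() -> Vec<&'static str> {
    X86_GPRS
        .iter()
        .copied()
        .filter(|r| *r != STACK_POINTER && !RESERVED.contains(r))
        .collect()
}
```

Under these assumptions only five registers (eax, ebx, ecx, edx, edi) are left, so seven operands can only fit if LLVM also hands out ebp and esi.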

We can put this in a lib.rs, and in main.rs we put:

use asm_test::my_test;

fn main() {
    println!("{}", my_test(1, 2, 3, 4));
}

Instead, when compiling in release mode using cargo build --release --target i686-pc-windows-msvc, it compiles. cargo asm --lib --target i686-pc-windows-msvc reveals that the compiler does allocate esi and ebp.

cargo asm output

```asm
.section .text,"xr",one_only,asm_test::my_test
.globl asm_test::my_test
.p2align 4, 0x90
asm_test::my_test:
Lfunc_begin0:
    .cv_func_id 0
    .cv_file 1 "R:\\asm-test\\src\\lib.rs" "D7E84472A8BA4CD0C091D8930D59F9C5C684E1A6" 2
    .cv_loc 0 1 3 0
    .cv_fpo_proc asm_test::my_test 16
    push ebp
    .cv_fpo_pushreg ebp
    push ebx
    .cv_fpo_pushreg ebx
    push edi
    .cv_fpo_pushreg edi
    push esi
    .cv_fpo_pushreg esi
    .cv_fpo_endprologue
    mov ecx, dword ptr [esp + 20]
    mov edx, dword ptr [esp + 24]
    mov esi, dword ptr [esp + 28]
    mov edi, dword ptr [esp + 32]
    mov eax, 3
    .cv_loc 0 1 6 0
    #APP
    mov ecx, edx
    xor ecx, esi
    and ecx, edi
    or ecx, ebx
    not ebp
    add ecx, ebp
    sub ecx, eax
    mov eax, ecx
    #NO_APP
    .cv_loc 0 1 26 0
    pop esi
    pop edi
    pop ebx
    pop ebp
    ret
    .cv_fpo_endproc
```

However, if we change #[inline(never)] to #[inline], the compiler correctly displays:

error: inline assembly requires more registers than available

I would expect a compilation error in both cases, and a guarantee that neither esi nor ebp gets allocated.

I haven't tried the x86_64 target yet, simply because it has many more general-purpose registers. I could try it later today.

Meta

Tried on both stable and nightly rustc --version --verbose:

rustc 1.66.1 (90743e729 2023-01-10)
binary: rustc
commit-hash: 90743e7298aca107ddaa0c202a4d3604e29bfeb6
commit-date: 2023-01-10
host: i686-pc-windows-msvc
release: 1.66.1
LLVM version: 15.0.2

rustc 1.68.0-nightly (1e4f90061 2023-01-11)
binary: rustc
commit-hash: 1e4f90061cc4bc566f99ab21b1f101182b10cf0c
commit-date: 2023-01-11
host: i686-pc-windows-msvc
release: 1.68.0-nightly
LLVM version: 15.0.6

No backtrace available.

asquared31415 commented 1 year ago

@rustbot +A-inline-assembly

I am fairly certain that the compiler is allowed to make that compile, since it seems to be saving and restoring the registers that it needs to preserve. We can't guarantee that it will be able to do this, however.

Speculation: the change in behavior due to inlining is likely a result of the compiler now having to allocate different registers (and running out) because of the surrounding context, without understanding that it could still preserve them?

johnmave126 commented 1 year ago

I came across this: #84658, so I thought the compiler was designed to reject it. If I understand you correctly, the compiler only rejects using a reserved register if we explicitly ask for inout("esi"), but could still allocate it if we use inout(reg). Am I right?

asquared31415 commented 1 year ago

Correct. From the Reference:

The frame pointer and base pointer registers are reserved for internal use by LLVM. While asm! statements cannot explicitly specify the use of reserved registers, in some cases LLVM will allocate one of these reserved registers for reg operands. Assembly code making use of reserved registers should be careful since reg operands may use the same registers.

We can never allow you to require LLVM to use a reserved register, since it's reserved, but if LLVM gives it up willingly for some general-purpose use, that is perfectly fine.
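The distinction can be illustrated with a minimal sketch. The function name `add_with_reg_class` is hypothetical, and the non-x86 fallback exists only so the snippet builds everywhere. The operands use the `reg` class, so the choice of register is left to LLVM. The commented-out line shows the explicit form that, per the discussion above, is rejected on 32-bit x86.

```rust
/// Adds two numbers with inline assembly, letting LLVM pick the registers.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn add_with_reg_class(a: u32, b: u32) -> u32 {
    use std::arch::asm;

    let mut sum = a;
    unsafe {
        // `reg` names a register class, not a register. LLVM may satisfy it
        // with any register it is willing to give up, which per the Reference
        // can include one that is otherwise reserved.
        asm!(
            "add {sum:e}, {b:e}",
            sum = inout(reg) sum,
            b = in(reg) b,
            options(pure, nomem, nostack),
        );

        // Intentionally commented out: explicitly naming a reserved register
        // is expected to be rejected at compile time on 32-bit x86.
        // asm!("nop", out("esi") _);
    }
    sum
}

/// Fallback for other architectures (an assumption made for portability only).
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn add_with_reg_class(a: u32, b: u32) -> u32 {
    a.wrapping_add(b)
}
```

Code that must keep a particular register untouched therefore cannot rely on `reg` operands avoiding it.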

asquared31415 commented 1 year ago

I still think there's something strange going on with the fact that it stops compiling, but the fact that LLVM chooses a reserved register for a (reg) operand is not a problem on its own.

johnmave126 commented 1 year ago

Gotcha, will change the title to a more appropriate one.