rust-lang / rust

asm! doesn't accept `offset` syntax, but gcc does #79874

Open joshtriplett opened 3 years ago

joshtriplett commented 3 years ago

In Intel-syntax assembly, to refer to the address of a symbol rather than the memory pointed to by that symbol, you have to write `offset symbol` rather than just `symbol`. For instance, `mov rax, 1f` is equivalent to `mov rax, qword ptr 1f` and moves the memory pointed to by `1f` into `rax`, while `mov rax, offset 1f` moves the address of `1f` into `rax`.

GCC and GAS accept this syntax. However, clang and Rust do not.

(The examples below use position-dependent addressing for simplicity. Production code should use position-independent addressing.)

/tmp$ cat offset.c 
#include <stdio.h>

int main() {
    unsigned value = 0;
    __asm__ __volatile__(
        "mov %0, 1\n"
        "mov rax, offset 3f\n"
        "jmp rax\n"
        "mov %0, 2\n"
        "3:"
        : "=r" (value)
        :
        : "rax"
    );
    printf("%u\n", value);
    return 0;
}

/tmp$ gcc -no-pie -masm=intel -O3 offset.c -o offset
/tmp$ ./offset 
1
/tmp$ clang-12 -no-pie -masm=intel -O3 offset.c -o offset
offset.c:7:10: error: unexpected token in argument list
        "mov rax, offset 3f\n"
         ^
<inline asm>:2:17: note: instantiated into assembly here
mov rax, offset 3f
                ^
1 error generated.
(1) /tmp$ cat offset.rs 
#![feature(asm)]

fn main() {
    let mut value: u64;
    unsafe {
        asm!(
          "mov {value}, 1",
          "mov rax, offset 3f",
          "jmp rax",
          "mov {value}, 2",
          "3:",
          value = out(reg) value,
          out("rax") _,
        );
    }
    dbg!(value);
}
/tmp$ rustc +nightly offset.rs 
error: unexpected token!
 --> offset.rs:8:12
  |
8 |           "mov rax, offset 3f",
  |            ^
  |
note: instantiated into assembly here
 --> <inline asm>:3:17
  |
3 | mov rax, offset 3f
  |                 ^

error: aborting due to previous error

(1) /tmp$ rustc +nightly --version
rustc 1.50.0-nightly (1c389ffef 2020-11-24)

joshtriplett commented 3 years ago

It looks like this is https://bugs.llvm.org/show_bug.cgi?id=32530

jyn514 commented 3 years ago

A workaround is to use lea instead of mov.
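
A minimal sketch of that workaround, assuming an x86_64 target and a current stable toolchain (so `std::arch::asm` rather than the nightly `#![feature(asm)]` in the original report). The function name `jump_over_store` is made up for illustration. The RIP-relative `lea` loads the address of the local label without needing `offset`, and it is position-independent as a side effect.

```rust
use std::arch::asm;

/// Hypothetical helper: loads the address of local label `3` with `lea`,
/// jumps to it, and so skips the second store to `value`.
fn jump_over_store() -> u64 {
    let value: u64;
    unsafe {
        asm!(
            "mov {value}, 1",
            // Address of the label, computed relative to RIP (no `offset`).
            "lea rax, [rip + 3f]",
            "jmp rax",
            // Skipped if the jump lands on the label below.
            "mov {value}, 2",
            "3:",
            value = out(reg) value,
            out("rax") _,
        );
    }
    value
}

fn main() {
    println!("{}", jump_over_store());
}
```

This mirrors the `offset.rs` reproduction, with only the `mov rax, offset 3f` line swapped for `lea`.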

joshtriplett commented 3 years ago

That's true, but lea has disadvantages as well: you can't lea into a memory destination, whereas you can mov into a memory destination as long as you have a constant source. (Though you can't move a 64-bit immediate directly to memory, which makes that a moot point for 64-bit addresses.)
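
To illustrate the memory-destination limitation, here is a rough sketch (x86_64 assumed, names such as `label_address_to_memory` are hypothetical) of what the `lea` route costs: the address has to go through a scratch register before it can be stored.

```rust
use std::arch::asm;

/// Hypothetical helper: stores the address of local label `3` into `slot`.
/// `lea` can only target a register, so a scratch register is needed
/// before the plain `mov` to memory.
fn label_address_to_memory() -> u64 {
    let mut slot: u64 = 0;
    unsafe {
        asm!(
            // Step 1: label address into a scratch register.
            "lea {tmp}, [rip + 3f]",
            // Step 2: ordinary register-to-memory store.
            "mov qword ptr [{dst}], {tmp}",
            "3:",
            dst = in(reg) &mut slot as *mut u64,
            tmp = out(reg) _,
        );
    }
    slot
}

fn main() {
    println!("{:#x}", label_address_to_memory());
}
```

The printed address varies between runs and builds; the point is only that two instructions and one extra register are involved.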

evmar commented 1 month ago

I ran into a variant of this and just wanted to update on behavior.

(Note that for C on godbolt, you need to toggle the 'link to binary' checkbox for it to actually assemble the asm.)

With clang version 17.0.1 (godbolt):

With gcc 14.2 (godbolt):

With Rust 1.80.0 (godbolt):

So I think at this point Clang matches gcc (at least in the cases where it accepts the syntax), while in Rust, reading a label address is inexpressible.

(This came up in my win32 emulator here, in the code that manages 64-bit => 32-bit transitions, where I am intentionally building a non-PIE binary because I need careful control over the memory layout of the first 4 GB of address space.)

evmar commented 1 month ago

> It looks like this is https://bugs.llvm.org/show_bug.cgi?id=32530

BTW, this bug is fixed upstream. This bug was about the offset syntax in general, which appears to now be accepted by Rust:

        "mov rax, {x}",  // value of x
        "mov rax, offset {x}",  // address of x

However, it appears local labels somehow behave differently.
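
For comparison, a position-independent sketch of the same value-versus-address distinction for a named symbol, assuming x86_64 and a `sym` operand. The static `X` and the function `value_and_address` are illustrative names, not taken from the snippet above, and this uses RIP-relative addressing instead of `offset`, so it does not exercise the `offset` parsing path itself.

```rust
use std::arch::asm;

// Illustrative static; any `static` or `fn` path can be used with `sym`.
static X: u64 = 42;

/// Hypothetical helper: returns (value stored in X, address of X).
fn value_and_address() -> (u64, u64) {
    let value: u64;
    let address: u64;
    unsafe {
        asm!(
            // Memory operand: loads the 8 bytes stored at X.
            "mov {value}, qword ptr [rip + {x}]",
            // Address computation: loads where X lives, not its contents.
            "lea {address}, [rip + {x}]",
            value = out(reg) value,
            address = out(reg) address,
            x = sym X,
        );
    }
    (value, address)
}

fn main() {
    let (value, address) = value_and_address();
    println!("value = {value}, address = {address:#x}");
}
```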