llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Wrong code for inline assembly with `-masm=intel` on x86_64 #61640

Open lhmouse opened 1 year ago

lhmouse commented 1 year ago

Godbolt: https://gcc.godbolt.org/z/acP7Ee73b

using my_function = int (int, int);
my_function* my_fptr;

int
ptc_indirect_call(int a, int b)
  {
    return my_fptr(a, b);
  }

int
asm_indirect_call(int a, int b)
  {
    __asm__ ("jmp qword ptr [my_fptr]");
    __builtin_unreachable();
  }

ptc_indirect_call(int, int):                # @ptc_indirect_call(int, int)
        mov     rax, qword ptr [rip + my_fptr]
        rex64 jmp       rax                     # TAILCALL
asm_indirect_call(int, int):                # @asm_indirect_call(int, int)

        jmp     my_fptr

my_fptr:
        .quad   0

The inline asm statement is assembled as a direct jump to the symbol `my_fptr` rather than an indirect jump through the pointer stored there, so control transfers into non-executable data. (AT&T syntax does not suffer from this issue.)

lhmouse commented 1 year ago

It looks like

__asm__ ("jmp qword ptr [my_fptr]");

is not valid; I have to add the base register rip explicitly:

__asm__ ("jmp qword ptr [rip + my_fptr]");

So the issue is the other way around: if the first statement is not valid, Clang should have rejected it instead of accepting it and silently generating something unspecified.
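
For comparison, here is a minimal sketch of one way to avoid hand-writing the address at all. It is not something proposed in this thread. It assumes GCC-style extended asm: the pointer is passed as an `"m"` operand so the compiler prints the memory reference itself, and the `{att|intel}` alternative picks the operand order for whichever dialect is active. The names `add` and `operand_indirect_call` are hypothetical, and the sketch loads the pointer and calls it from C++ instead of jumping from inside the asm, so it does not depend on the function having no prologue.

```cpp
using my_function = int (int, int);
my_function* my_fptr;

// Hypothetical callee, used only to exercise the pointer.
int add(int a, int b) { return a + b; }

int
operand_indirect_call(int a, int b)
  {
#if defined(__x86_64__) && defined(__GNUC__)
    // "m"(my_fptr) lets the compiler emit the memory reference, so no
    // address is spelled out by hand in the asm string.
    // {att|intel} selects the operand order for the active dialect.
    my_function* target;
    __asm__ ("mov {%1, %0|%0, %1}" : "=r"(target) : "m"(my_fptr));
    return target(a, b);
#else
    // Assumed fallback for other targets or compilers: plain indirect call.
    return my_fptr(a, b);
#endif
  }
```

After `my_fptr = add;`, calling `operand_indirect_call(2, 3)` goes through the pointer loaded by the asm statement. Whether the same operand-based form also sidesteps the problem for `jmp` under `-masm=intel` is not verified here.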

llvmbot commented 1 year ago

@llvm/issue-subscribers-backend-x86
