llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

armv7-a thumb clang reorders ldr before the first inline asm in __attribute__((naked)) function #56567

Open k15tfu opened 2 years ago

k15tfu commented 2 years ago

Hi!

Compile the following code with -O2 -fPIC -march=armv7-a -mthumb:

void * this_ptr;
void (*call_enter_ptr)(void * this_);

__attribute__((naked))
void raw_call_enter() {
#if defined(__aarch64__)
    asm volatile(
        "stp x0, x1, [sp, #-16]!\n"
        // ...
        : : :
    );

    asm volatile("" : : : "memory"); // compiler barrier
    asm volatile(
        "ldr x0, %1\n"
        "blr %0\n"
        : : "r"(call_enter_ptr), "m"(this_ptr) : "x0"
    );

    asm volatile(
        // ...
        "ldp x0, x1, [sp], #16\n"
        : : :
    );

    asm volatile(
        "ret"
    );
#elif defined(__arm__)
    asm volatile(
        "push {r0-r3, r12, lr}\n"
        : : :
    );

    asm volatile("" : : : "memory"); // compiler barrier
    asm volatile(
        "ldr r0, %1\n"
        "blx %0\n"
        : : "r"(call_enter_ptr), "m"(this_ptr) : "r0"
    );

    asm volatile(
        "pop {r0-r3, r12, lr}\n"
        : : :
    );

    asm volatile(
        "bx lr"
    );
#endif
}

Output:

raw_call_enter():
        ldr     r0, .LCPI0_0
        ldr     r1, .LCPI0_1
        push.w  {r0, r1, r2, r3, r12, lr}  <-- r0 and r1 were already overwritten above, so the wrong values get saved

.LPC0_0:
        add     r0, pc
.LPC0_1:
        add     r1, pc
        ldr     r0, [r0]
        ldr     r1, [r1]
        ldr     r2, [r0]
        ldr     r0, [r1]
        blx     r2

        pop.w   {r0, r1, r2, r3, r12, lr}

        bx      lr
.LCPI0_0:
.Ltmp1:
        .long   call_enter_ptr(GOT_PREL)-((.LPC0_0+4)-.Ltmp1)
.LCPI0_1:
.Ltmp2:
        .long   this_ptr(GOT_PREL)-((.LPC0_1+4)-.Ltmp2)
this_ptr:
        .long   0

call_enter_ptr:
        .long   0

Compiler: armv7-a clang 11.0.1 (https://godbolt.org/z/d9dMndvsj)

llvmbot commented 2 years ago

@llvm/issue-subscribers-backend-arm

EugeneZelenko commented 2 years ago

Could you please try the main branch? Clang 11 is too old.

DavidSpickett commented 2 years ago

The output is the same on main (b94ea8b3ebc16504b668fd7086de544637e0cd53).

I believe this is expected behaviour; I see the same thing happen with GCC. The GCC docs for naked (https://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Function-Attributes.html) say:

The only statements that can be safely included in naked functions are asm statements that do not have operands.

Arm Compiler For Embedded (aka Arm Compiler 6), which uses clang, says the same thing (https://developer.arm.com/documentation/101754/0618/armclang-Reference/Compiler-specific-Function--Variable--and-Type-Attributes/--attribute----naked---function-attribute):

The compiler only supports basic __asm statements in __attribute__((naked)) functions. Using extended assembly, parameter references or mixing C code with __asm statements might not work reliably.

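If you want to guarantee the output, you'll have to recreate what the PIC code would do, but in inline asm, so that the compiler has no operands to materialise and therefore no loads of its own to schedule. Alternatively, if all you care about is that the compiler-generated instructions come after the compiler barrier (the empty asm with the "memory" clobber), you could lift the parts before that barrier into the callers.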
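For illustration, here is a minimal sketch of a third option that follows the quoted docs: keep the naked function down to a single basic asm statement with no operands, and branch to an ordinary C helper that does the global loads. This is a different technique from hand-writing the PIC sequence in asm, and it is an assumption on my part rather than something proposed in this thread. The names call_enter_helper, record_enter and last_seen are hypothetical, and the sketch assumes an ELF target where C symbol names are not prefixed. On non-Arm hosts it falls back to plain C so that it still compiles and runs.

```c
#include <stddef.h>

void *this_ptr;
void (*call_enter_ptr)(void *this_);

/* Hypothetical helper (not in the original report). It is an ordinary C
 * function, so the compiler is free to emit whatever PIC/GOT loads it
 * needs here, after the registers have already been saved by the caller.
 * "used" keeps it alive although it is only referenced from asm; hidden
 * visibility lets the bl below bind directly to this definition. */
__attribute__((used, visibility("hidden")))
void call_enter_helper(void) {
    if (call_enter_ptr != NULL)
        call_enter_ptr(this_ptr);
}

#if defined(__arm__)
/* The naked function contains only one basic asm statement with no
 * operands, which is the form the GCC and armclang docs describe as safe. */
__attribute__((naked))
void raw_call_enter(void) {
    __asm__(
        "push {r0-r3, r12, lr}\n"   /* 6 registers = 24 bytes, sp stays 8-byte aligned */
        "bl   call_enter_helper\n"  /* PC-relative call, no operands required */
        "pop  {r0-r3, r12, lr}\n"
        "bx   lr\n"
    );
}
#else
/* Fallback for other architectures so the sketch builds anywhere;
 * it does not preserve registers the way the Arm version does. */
void raw_call_enter(void) {
    call_enter_helper();
}
#endif

/* Hypothetical callback used only to demonstrate the call path. */
void *last_seen;
void record_enter(void *this_) {
    last_seen = this_;
}
```

To try it, set this_ptr to some object, set call_enter_ptr to record_enter, call raw_call_enter(), and check that last_seen equals the pointer you stored. Like the original, this only preserves the core registers in the push list; floating-point registers are not saved.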