llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[ARMv6M] missing tail call opportunities and unneeded instr emitted #30575

Open llvmbot opened 7 years ago

llvmbot commented 7 years ago
Bugzilla Link: 31227 (https://llvm.org/bz31227)
Version: trunk
OS: Linux
Reporter: LLVM Bugzilla Contributor
CC: @efriedma-quic

Extended Description

test.c:

__attribute__((noinline)) static int foo(int a) {
  return a*10;
}

int caller(int b) {
  b += 10;
  return foo(b);
}

clang -mcpu=cortex-m0 -target armv6m-linux-gnueabi -S -Os test.c -o test.llvm.s

test.llvm.s:

caller:
    .fnstart
    push    {r7, lr}
    add r7, sp, #0  ==> why do we need this?
    adds    r0, #10
    bl  foo
    pop {r7, pc}

Other compilers generate:

00000006 <caller>:
   6:   300a        adds    r0, #10
   8:   e7fe        b.n 0 <foo>

LLVM generated code has 2 problems:

  1. An unneeded instruction, "add r7, sp, #0", is emitted.
  2. Other compilers can emit a tail call. On ARMv6-M, "b" can use encoding T2 with imm11, but that is still not a big range (about ±2 KB). Maybe the linker can insert a trampoline?
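
To put a number on that range, here is a minimal sketch in C that decodes the 16-bit unconditional branch. It assumes the ARMv6-M encoding T2 layout (top five bits 11100, then imm11; the offset is imm11 followed by a zero bit, sign-extended, and added to the instruction address plus 4). The helper names are made up for illustration and are not part of any toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of LLVM or binutils. */

/* Returns 1 if hw is a 16-bit Thumb unconditional branch (encoding T2). */
static int is_thumb_b_t2(uint16_t hw) {
    return (hw & 0xF800u) == 0xE000u;
}

/* Signed byte offset held in the imm11 field: SignExtend(imm11:'0'). */
static int32_t thumb_b_t2_offset(uint16_t hw) {
    assert(is_thumb_b_t2(hw));
    int32_t off = (int32_t)(hw & 0x07FFu) << 1;  /* 12-bit value */
    if (off & 0x800)                              /* bit 11 is the sign bit */
        off -= 0x1000;
    return off;
}

/* Branch target: the PC reads as the instruction address plus 4. */
static uint32_t thumb_b_t2_target(uint32_t insn_addr, uint16_t hw) {
    return insn_addr + 4u + (uint32_t)thumb_b_t2_offset(hw);
}
```

Under these assumptions the encodable offsets run from -2048 to +2046 bytes. Decoding the halfword `e7fe` from the listing above gives an offset of -4, i.e. a branch to its own address, which would be consistent with an unrelocated placeholder in an object file (that reading is an assumption, not something the report states).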
efriedma-quic commented 7 years ago

From ARMSubtarget.cpp:

// FIXME: Completely disable sibcall for Thumb1 since ThumbRegisterInfo::
// emitEpilogue is not ready for them. Thumb tail calls also use t2B, as
// the Thumb1 16-bit unconditional branch doesn't have sufficient relocation
// support in the assembler and linker to be used. This would need to be
// fixed to fully support tail calls in Thumb1.
//
// Doing this is tricky, since the LDM/POP instruction on Thumb doesn't take
// LR. This means if we need to reload LR, it takes an extra instructions,
// which outweighs the value of the tail call; but here we don't know yet
// whether LR is going to be used. Probably the right approach is to
// generate the tail call here and turn it back into CALL/RET in
// emitEpilogue if LR is used.
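
The approach suggested at the end of that comment can be sketched as a small decision function. This is only an illustration in C with made-up names (`ExitKind`, `choose_exit`), not LLVM code, and it assumes the only relevant input is whether LR ends up being saved in the prologue.

```c
#include <stdbool.h>

/* Hypothetical model, not LLVM's data structures. */
typedef enum {
    EXIT_TAIL_BRANCH,   /* plain "b callee", nothing to restore */
    EXIT_CALL_THEN_RET  /* "bl callee" followed by a normal return */
} ExitKind;

/*
 * Optimistically treat the call as a tail call, then fall back to
 * CALL/RET once it is known that LR had to be saved: Thumb1 POP cannot
 * restore LR directly, so reloading it would cost extra instructions.
 */
static ExitKind choose_exit(bool call_in_tail_position, bool lr_saved) {
    if (call_in_tail_position && !lr_saved)
        return EXIT_TAIL_BRANCH;
    return EXIT_CALL_THEN_RET;
}
```

The point of deferring the choice is that whether LR is saved is only known late, at the time the epilogue is emitted, which is why the comment proposes undoing the tail call there rather than deciding up front.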

llvmbot commented 7 years ago

"add r7, sp, #0" is there to establish the frame. It's OK.

llvmbot commented 1 month ago

@llvm/issue-subscribers-backend-arm
