Quuxplusone / LLVMBugzillaTest

Missed Tail Call Optimization when specifying function return value with __builtin_unreachable or __builtin_assume #41977

Open Quuxplusone opened 5 years ago

Quuxplusone commented 5 years ago
| Field | Value |
| --- | --- |
| Bugzilla Link | PR43007 |
| Status | REOPENED |
| Importance | P enhancement |
| Reported by | Michael Kuklinski (mike.k@digitalcarbide.com) |
| Reported on | 2019-08-15 09:56:23 -0700 |
| Last modified on | 2019-11-03 16:06:50 -0800 |
| Version | trunk |
| Hardware | All All |
| CC | david.bolvansky@gmail.com, llvm-bugs@lists.llvm.org, mike.k@digitalcarbide.com, nok.raven@gmail.com |
| Fixed by commit(s) | |
| Attachments | |
| Blocks | |
| Blocked by | |
| See also | PR13826 |

When explicitly specifying that a function can only return a specific value
using either __builtin_unreachable or __builtin_assume, the optimizer fails to
perform tail call optimization.

This is observed for all architectures.

Example:

extern int function_returns_only_1_or_doesnt_return(int, int);

int foo1(int a, int b) {
    const int result = function_returns_only_1_or_doesnt_return(a, b);
    if (result == 1) {
        return result;
    }
    else {
        __builtin_unreachable();
    }
}

int foo2(int a, int b) {
    const int result = function_returns_only_1_or_doesnt_return(a, b);
    __builtin_assume(result == 1);
    return result;
}

int foo3(int a, int b) {
    return function_returns_only_1_or_doesnt_return(a, b);
}

Compiled with '-O3' for an x86-64 target, this produces the following assembly:

foo1(int, int): # @foo1(int, int)
  push rax
  call function_returns_only_1_or_doesnt_return(int, int)
  mov eax, 1
  pop rcx
  ret
foo2(int, int): # @foo2(int, int)
  push rax
  call function_returns_only_1_or_doesnt_return(int, int)
  mov eax, 1
  pop rcx
  ret
foo3(int, int): # @foo3(int, int)
  jmp function_returns_only_1_or_doesnt_return(int, int) # TAILCALL

All three functions would be expected to compile to a tail call.
Quuxplusone commented 5 years ago

https://reviews.llvm.org/D66096 should fix it

Quuxplusone commented 5 years ago
The code assumes (result == 1), but we don't know whether
'function_returns_only_1_or_doesnt_return' has side effects.

With __attribute__((const)):
extern int function_returns_only_1_or_doesnt_return(int, int)
__attribute__((const));

We get what we want:

define dso_local i32 @foo1(i32 %0, i32 %1) local_unnamed_addr #0 {
  ret i32 1
}

define dso_local i32 @foo2(i32 %0, i32 %1) local_unnamed_addr #0 {
  ret i32 1
}

So I believe this works as expected.
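
To make the side-effect point concrete, here is a minimal sketch, not taken from the report. It assumes a hypothetical stand-in definition of the callee that always returns 1 but also increments a counter (the `call_count` variable and the `noinline` attribute are illustrative additions). foo2 is left out so the sketch also builds with compilers that lack __builtin_assume.

```c
/* Hypothetical counter, added only to make the side effect observable. */
static int call_count = 0;

/* Hypothetical stand-in for the external callee from the report.
   It always returns 1, but it also has a side effect, so it could not
   legitimately be declared __attribute__((const)). noinline keeps it
   a real call, as it would be for an extern function. */
__attribute__((noinline))
int function_returns_only_1_or_doesnt_return(int a, int b) {
    (void)a;
    (void)b;
    ++call_count;
    return 1;
}

/* Same shape as foo1 in the report: the return value is pinned to 1,
   yet the call itself must still happen because of its side effect. */
int foo1(int a, int b) {
    const int result = function_returns_only_1_or_doesnt_return(a, b);
    if (result == 1) {
        return result;
    }
    else {
        __builtin_unreachable();
    }
}

/* Same shape as foo3 in the report: a plain forwarding call. */
int foo3(int a, int b) {
    return function_returns_only_1_or_doesnt_return(a, b);
}
```

With a callee like this, the compiler may fold the returned value to the constant 1, but it cannot drop the call. That is consistent with the `mov eax, 1` after the call in the assembly above, and it is why a tail call would still be the desired code here.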
Quuxplusone commented 5 years ago

Eh, but you are right about performing tail call optimization. Reopened.

Quuxplusone commented 4 years ago

Probably a duplicate of bug 13826.