llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Missed Tail Call Optimization when specifying function return value with __builtin_unreachable or __builtin_assume #42352

Open 90ab0e2f-5dbd-421b-a4bc-0d74b67c078b opened 5 years ago

90ab0e2f-5dbd-421b-a4bc-0d74b67c078b commented 5 years ago
Bugzilla Link 43007
Version trunk
OS All
CC @davidbolvansky,@ameisen,@Kojoley

Extended Description

When explicitly specifying that a function can only return a specific value using either __builtin_unreachable or __builtin_assume, the optimizer fails to perform tail call optimization.

This is observed for all architectures.

Example:

    extern int function_returns_only_1_or_doesnt_return(int, int);

    int foo1(int a, int b) {
        const int result = function_returns_only_1_or_doesnt_return(a, b);
        if (result == 1) {
            return result;
        } else {
            __builtin_unreachable();
        }
    }

    int foo2(int a, int b) {
        const int result = function_returns_only_1_or_doesnt_return(a, b);
        __builtin_assume(result == 1);
        return result;
    }

    int foo3(int a, int b) {
        return function_returns_only_1_or_doesnt_return(a, b);
    }

With '-O3' on an x86-64 target, this emits the following assembly:

    foo1(int, int):                         # @foo1(int, int)
            push    rax
            call    function_returns_only_1_or_doesnt_return(int, int)
            mov     eax, 1
            pop     rcx
            ret
    foo2(int, int):                         # @foo2(int, int)
            push    rax
            call    function_returns_only_1_or_doesnt_return(int, int)
            mov     eax, 1
            pop     rcx
            ret
    foo3(int, int):                         # @foo3(int, int)
            jmp     function_returns_only_1_or_doesnt_return(int, int) # TAILCALL

All three functions would be expected to compile to a tail call, as foo3 does.

Kojoley commented 4 years ago

Probably a duplicate of llvm/llvm-project#14198.

davidbolvansky commented 5 years ago

Eh, but you are right about performing tail call optimization. Reopened.

davidbolvansky commented 5 years ago

The code assumes (result == 1), but we don't know whether 'function_returns_only_1_or_doesnt_return' has side effects.
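To illustrate the side-effect point, here is a minimal sketch. The body of the callee, the `call_count` counter, and `check_side_effect` are all hypothetical and are not part of the original report; the real callee is only declared `extern` there. Because the call is observable, the optimizer may replace the returned value with the constant 1 but must still perform the call.

```c
#include <assert.h>

/* Hypothetical counter, used only to make the call observable. */
static int call_count = 0;

/* Hypothetical definition for illustration: always returns 1, but also
   has a side effect. noinline keeps the call visible to the caller. */
__attribute__((noinline))
int function_returns_only_1_or_doesnt_return(int a, int b) {
    (void)a;
    (void)b;
    ++call_count;
    return 1;
}

/* Same shape as foo1 in the report. */
int foo1(int a, int b) {
    const int result = function_returns_only_1_or_doesnt_return(a, b);
    if (result == 1) {
        return result;
    }
    __builtin_unreachable();
}

/* Hypothetical self-check: the result is 1 and the call really happened. */
void check_side_effect(void) {
    const int before = call_count;
    const int r = foo1(3, 4);
    assert(r == 1);
    assert(call_count == before + 1);
}
```

Folding foo1 to a bare `return 1` would skip the counter update, so the call has to stay. Whether it is emitted as `call` + `mov eax, 1` or as a `jmp` is exactly what the report is about.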

With __attribute__((const)):

    extern int function_returns_only_1_or_doesnt_return(int, int) __attribute__((const));

We get what we want:

    define dso_local i32 @foo1(i32 %0, i32 %1) local_unnamed_addr #0 {
      ret i32 1
    }

    define dso_local i32 @foo2(i32 %0, i32 %1) local_unnamed_addr #0 {
      ret i32 1
    }

so I believe this works as expected.
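For anyone who wants to try the `const` variant locally, a self-contained sketch follows. The `ASSUME` macro and the trivial definition of the callee are assumptions added here so the file compiles and links with both Clang and GCC; they do not appear in the report.

```c
/* Hypothetical portability macro: Clang has __builtin_assume; for other
   compilers fall back to the __builtin_unreachable form used in foo1. */
#if defined(__clang__)
#define ASSUME(cond) __builtin_assume(cond)
#else
#define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#endif

/* Declaration as in the comment above: const promises that the result
   depends only on the arguments and that there are no side effects. */
extern int function_returns_only_1_or_doesnt_return(int, int)
    __attribute__((const));

int foo1(int a, int b) {
    const int result = function_returns_only_1_or_doesnt_return(a, b);
    if (result == 1) {
        return result;
    }
    __builtin_unreachable();
}

int foo2(int a, int b) {
    const int result = function_returns_only_1_or_doesnt_return(a, b);
    ASSUME(result == 1);
    return result;
}

/* Hypothetical definition so the sketch links; it always returns 1. */
int function_returns_only_1_or_doesnt_return(int a, int b) {
    (void)a;
    (void)b;
    return 1;
}
```

With the `const` promise in place the call has no observable effect, so dropping it and returning 1 directly is legal. That is a different situation from the original report, where the callee may have side effects and the call must be kept.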

davidbolvansky commented 5 years ago

https://reviews.llvm.org/D66096 should fix it