Open TheThief opened 3 years ago
mentioned in issue llvm/llvm-bugzilla-archive#51000
The relevant code is X86TargetLowering::IsEligibleForTailCallOptimization, specifically its call to MatchingStackOffset. I'm not sure what that code is trying to do.
Interestingly, if [[clang::musttail]] is added to the return statement, clang generates the expected code:
foo(int, int, int, int, int, int, int):                 # @foo(int, int, int, int, int, int, int)
        mov     eax, dword ptr [rsp + 8]
        add     eax, 7
        mov     dword ptr [rsp + 8], eax
        jmp     bar(int, int, int, int, int, int, int)  # TAILCALL
This shows that clang/LLVM is capable of generating the optimal code; it just fails to do so automatically for some reason.
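For readers who want to try the attribute, here is a minimal sketch of how the annotated return statement might look. It is an assumption based on the assembly above (seven int parameters, 7 added to the last one), not the exact source from the report. The MUSTTAIL macro and the body of bar are hypothetical additions so that the example also builds on compilers without the attribute.

```cpp
// Hypothetical helper macro (not from the report): expands to
// [[clang::musttail]] where the compiler advertises it, and to nothing
// elsewhere, so the same source still builds with other compilers.
#if defined(__has_cpp_attribute)
#  if __has_cpp_attribute(clang::musttail)
#    define MUSTTAIL [[clang::musttail]]
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

// Placeholder callee so the example links and runs; the report presumably
// only declared bar. It has the same signature as foo, which musttail needs.
int bar(int a, int b, int c, int d, int e, int f, int g) {
    (void)a; (void)b; (void)c; (void)d; (void)e; (void)f;
    return g;
}

int foo(int a, int b, int c, int d, int e, int f, int g) {
    // The seventh int argument is passed on the stack under the x86-64
    // System V calling convention, which matches the [rsp + 8] accesses
    // in the assembly above.
    MUSTTAIL return bar(a, b, c, d, e, f, g + 7);
}
```

To inspect the generated code rather than run it, one would typically leave bar as a declaration only, so the optimizer cannot see through the call.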
Extended Description
While discussing clang's [[clang::musttail]] attribute on Reddit, we discovered a case where clang/LLVM doesn't produce a tail call but GCC and MSVC both do:
https://gcc.godbolt.org/z/doYGdG1dT
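In case the link stops working, the following is a guess at the shape of the test case, reconstructed from the assembly quoted elsewhere in this report. The actual source behind the link may differ. The NOINLINE macro and the body of bar are hypothetical and exist only so the snippet is self-contained and runnable.

```cpp
// Hypothetical reconstruction, not the verified source from the link.
#if defined(__GNUC__) || defined(__clang__)
#  define NOINLINE __attribute__((noinline))
#elif defined(_MSC_VER)
#  define NOINLINE __declspec(noinline)
#else
#  define NOINLINE
#endif

// Placeholder callee. It is kept out of line so the call in foo survives
// optimization and can be checked for a jmp (tail call) versus a call.
NOINLINE int bar(int a, int b, int c, int d, int e, int f, int g) {
    return a + b + c + d + e + f + g;
}

int foo(int a, int b, int c, int d, int e, int f, int g) {
    // Six int arguments travel in registers on x86-64 System V; the
    // seventh lives on the stack and is the one modified before the call.
    return bar(a, b, c, d, e, f, g + 7);
}
```

Compiling this with optimizations enabled and comparing the end of foo across compilers should show whether the call to bar is emitted as a jump or as a call followed by a return.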
Clang output:
Additionally, clang oddly pushes rax at the start of the function but restores it into rcx at the end. It seems likely that this saved register is why it's not performing the tail call, but it shouldn't be trying to preserve rax in the first place!
Trying older versions of clang on godbolt suggests that this codegen issue is very old indeed.