conditional branch to unconditional tailcall not folded to conditional tailcall #49474

Closed nickdesaulniers closed 3 years ago

nickdesaulniers commented 3 years ago
Bugzilla Link: 50130
Resolution: DUPLICATE
Resolved on: Apr 27, 2021 04:52
Version: trunk
OS: All
CC: @chandlerc, @topperc, @zmodem, @haberman, @qcolombet

Extended Description

The blog post https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html and comment thread https://news.ycombinator.com/item?id=26934616#26937585 point to a test case (reduced):

target triple = "x86_64-unknown-linux-gnu"

declare dso_local i8* @foo(i8*, i64)
declare dso_local i8* @bar(i8*, i64)

define i8* @baz(i8* %ptr, i64 %data) {
; CHECK-LABEL: baz:
; CHECK:       # %bb.0: # %entry
; CHECK-NEXT:    testb %sil, %sil
; CHECK-NEXT:    je foo # TAILCALL
; CHECK-NEXT:  # %bb.1: # %if.end
; CHECK-NEXT:    addq $5, %rdi
; CHECK-NEXT:    jmp bar # TAILCALL
entry:
  %and = and i64 %data, 255
  %tobool.not = icmp eq i64 %and, 0
  br i1 %tobool.not, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  %call = musttail call i8* @foo(i8* %ptr, i64 %data)
  ret i8* %call

if.end:                                           ; preds = %entry
  %add.ptr4 = getelementptr inbounds i8, i8* %ptr, i64 5
  %call5 = musttail call i8* @bar(i8* nonnull %add.ptr4, i64 %data)
  ret i8* %call5
}

run through llc will instead produce:

baz:
  # %bb.0:
  testb   %sil, %sil
  je      .LBB0_2
  # %bb.1: # %if.then
  jmp     foo
.LBB0_2: # %if.end
  addq    $5, %rdi
  jmp     bar

though we could have produced:

baz:
  # %bb.0: # %entry
  testb %sil, %sil
  je foo # TAILCALL
  # %bb.1: # %if.end
  addq $5, %rdi
  jmp bar # TAILCALL

Some code added in D29856 in llvm/lib/CodeGen/BranchFolding.cpp looks like it could do the folding. Quick experimentation with removing the guards:

  1. OptForSize
  2. PredTBB == MBB

enables this optimization, but seems to regress quite a few tests:

Failed Tests (29):
  LLVM :: CodeGen/X86/add.ll
  LLVM :: CodeGen/X86/atom-pad-short-functions.ll
  LLVM :: CodeGen/X86/avx512-i1test.ll
  LLVM :: CodeGen/X86/bmi.ll
  LLVM :: CodeGen/X86/brcond.ll
  LLVM :: CodeGen/X86/btq.ll
  LLVM :: CodeGen/X86/cmp.ll
  LLVM :: CodeGen/X86/conditional-tailcall-pgso.ll
  LLVM :: CodeGen/X86/conditional-tailcall.ll
  LLVM :: CodeGen/X86/copy-eflags.ll
  LLVM :: CodeGen/X86/extern_weak.ll
  LLVM :: CodeGen/X86/fold-rmw-ops.ll
  LLVM :: CodeGen/X86/fp-strict-scalar-cmp.ll
  LLVM :: CodeGen/X86/funnel-shift.ll
  LLVM :: CodeGen/X86/neg_cmp.ll
  LLVM :: CodeGen/X86/or-branch.ll
  LLVM :: CodeGen/X86/peep-test-4.ll
  LLVM :: CodeGen/X86/pr37063.ll
  LLVM :: CodeGen/X86/rd-mod-wr-eflags.ll
  LLVM :: CodeGen/X86/sibcall.ll
  LLVM :: CodeGen/X86/slow-incdec.ll
  LLVM :: CodeGen/X86/sqrt-partial.ll
  LLVM :: CodeGen/X86/switch-bt.ll
  LLVM :: CodeGen/X86/tail-call-conditional.mir
  LLVM :: CodeGen/X86/tail-opts.ll
  LLVM :: CodeGen/X86/tailcall-cgp-dup.ll
  LLVM :: CodeGen/X86/tailcall-extract.ll
  LLVM :: CodeGen/X86/xor-icmp.ll
  LLVM :: DebugInfo/COFF/pgo.ll

Some of these are straightforward fixes that I think make sense, but others like llvm/test/CodeGen/X86/conditional-tailcall.ll look quite wrong (branching the wrong way, IIUC)!
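To make the intended fold concrete, here is a toy Python model of it. This is only a sketch over (opcode, operand) tuples and is not the logic in BranchFolding.cpp; the function names and the "bb.1" label are made up for illustration. It handles just the shape in this report: a conditional branch that jumps over a fallthrough block containing nothing but a tail call. Because the tail call sits on the fallthrough path rather than at the branch target, the condition has to be inverted when folding.

```python
# Toy model only: blocks are (label, [(opcode, operand), ...]) pairs,
# not LLVM MachineBasicBlocks. All names here are hypothetical.
INVERT = {"je": "jne", "jne": "je"}


def is_tailcall_only(instrs, labels):
    """True if the block is a single `jmp` to something that is not a local label."""
    return len(instrs) == 1 and instrs[0][0] == "jmp" and instrs[0][1] not in labels


def fold_fallthrough_tailcall(blocks):
    """Fold `jcc next; <fallthrough: jmp callee>; next:` into `j!cc callee; next:`."""
    labels = {label for label, _ in blocks}
    out = [(label, list(instrs)) for label, instrs in blocks]
    i = 0
    while i + 2 < len(out):
        _, pred = out[i]
        mid_label, mid = out[i + 1]
        next_label, _ = out[i + 2]
        if pred:
            op, target = pred[-1]
            # The fallthrough block may only be removed if nothing branches to it.
            referenced = any(t == mid_label for _, ins in out for _, t in ins)
            if (op in INVERT and target == next_label
                    and is_tailcall_only(mid, labels) and not referenced):
                # Invert the condition and branch straight to the callee;
                # the old branch target becomes the new fallthrough.
                pred[-1] = (INVERT[op], mid[0][1])
                del out[i + 1]
        i += 1
    return out


# The llc output from this report, transcribed into the toy representation.
llc_output = [
    ("baz", [("testb", "%sil, %sil"), ("je", ".LBB0_2")]),
    ("bb.1", [("jmp", "foo")]),
    (".LBB0_2", [("addq", "$5, %rdi"), ("jmp", "bar")]),
]
folded = fold_fallthrough_tailcall(llc_output)
```

On this input the model leaves two blocks: `testb` followed by `jne foo`, then `addq` followed by `jmp bar`. A fold that keeps the original `je` would send the zero case to `foo`, which is the opposite of what the IR asks for.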

zmodem commented 3 years ago

This bug has been marked as a duplicate of bug llvm/llvm-project#49469

nickdesaulniers commented 3 years ago

Sorry, I think that should have been:

though we could have produced:

baz:
  # %bb.0: # %entry
  testb %sil, %sil
  jne foo # TAILCALL  <== JNE
  # %bb.1: # %if.end
  addq $5, %rdi
  jmp bar # TAILCALL
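
The JNE version can be sanity-checked against the IR by hand. The small Python model below is a sketch, not anything from LLVM; `ir_target` and `asm_target` are hypothetical helpers. It assumes the SysV x86-64 convention in which `%data` arrives in `%rsi`, so `testb %sil, %sil` sets ZF exactly when the low byte of `%data` is zero.

```python
def ir_target(data):
    """Callee chosen by the IR in @baz."""
    tobool_not = (data & 255) == 0
    # br i1 %tobool.not, label %if.end, label %if.then
    # %if.end tail-calls @bar, %if.then tail-calls @foo.
    return "bar" if tobool_not else "foo"


def asm_target(data, cond_jump):
    """Callee chosen by `testb %sil, %sil; <cond_jump> foo; ...; jmp bar`."""
    zf = (data & 0xFF) == 0  # ZF is set when the low byte is zero
    if cond_jump == "je":
        taken = zf
    elif cond_jump == "jne":
        taken = not zf
    else:
        raise ValueError(f"unsupported jump: {cond_jump}")
    return "foo" if taken else "bar"


# Compare both candidate encodings against the IR over a range of inputs.
jne_matches = all(asm_target(d, "jne") == ir_target(d) for d in range(1024))
je_matches = all(asm_target(d, "je") == ir_target(d) for d in range(1024))
```

In this model `jne_matches` is true and `je_matches` is false, which agrees with the correction above: only the low byte of `%data` matters, and a nonzero low byte must reach `foo`.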