conditional branch to unconditional tailcall not folded to conditional tailcall #49099

Closed Quuxplusone closed 3 years ago

Quuxplusone commented 3 years ago
Bugzilla Link: PR50130
Status: RESOLVED DUPLICATE of bug 50125
Importance: P enhancement
Reported by: Nick Desaulniers (ndesaulniers@google.com)
Reported on: 2021-04-26 17:00:13 -0700
Last modified on: 2021-04-27 04:52:05 -0700
Version: trunk
Hardware: PC All
CC: chandlerc@gmail.com, craig.topper@gmail.com, hans@chromium.org, jhaberman@gmail.com, llvm-bugs@lists.llvm.org, quentin.colombet@gmail.com
Fixed by commit(s):
Attachments:
Blocks:
Blocked by:
See also: PR50125

The blog post https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html and comment thread https://news.ycombinator.com/item?id=26934616#26937585 point to a test case (reduced):

target triple = "x86_64-unknown-linux-gnu"

declare dso_local i8* @foo(i8*, i64)
declare dso_local i8* @bar(i8*, i64)

define i8* @baz(i8* %ptr, i64 %data) {
; CHECK-LABEL: baz:
; CHECK:       # %bb.0: # %entry
; CHECK-NEXT:    testb %sil, %sil
; CHECK-NEXT:    jne foo # TAILCALL
; CHECK-NEXT:  # %bb.1: # %if.end
; CHECK-NEXT:    addq $5, %rdi
; CHECK-NEXT:    jmp bar # TAILCALL
entry:
  %and = and i64 %data, 255
  %tobool.not = icmp eq i64 %and, 0
  br i1 %tobool.not, label %if.end, label %if.then

if.then:                                          ; preds = %entry
  %call = musttail call i8* @foo(i8* %ptr, i64 %data)
  ret i8* %call

if.end:                                           ; preds = %entry
  %add.ptr4 = getelementptr inbounds i8, i8* %ptr, i64 5
  %call5 = musttail call i8* @bar(i8* nonnull %add.ptr4, i64 %data)
  ret i8* %call5
}

Running this through llc instead produces:

baz:
  # %bb.0:
  testb   %sil, %sil
  je      .LBB0_2
  # %bb.1: # %if.then
  jmp     foo
.LBB0_2: # %if.end
  addq    $5, %rdi
  jmp     bar

though we could have produced:

baz:
  # %bb.0: # %entry
  testb %sil, %sil
  je foo # TAILCALL
  # %bb.1: # %if.end
  addq $5, %rdi
  jmp bar # TAILCALL
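
For orientation, here is a rough C-level sketch of a function that could lower to IR shaped like the reduced test case. It is an assumption about the source, not the code from the blog post: the MUSTTAIL macro, the last_callee tracker, and the bodies of foo and bar are hypothetical stand-ins (the test case only declares foo and bar). The macro uses clang's musttail statement attribute when available and expands to nothing otherwise.

```c
#include <stdint.h>

/* Use the musttail statement attribute when the compiler provides it;
   otherwise fall back to an ordinary return (assumption for portability). */
#if defined(__has_attribute)
#  if __has_attribute(musttail)
#    define MUSTTAIL __attribute__((musttail))
#  endif
#endif
#ifndef MUSTTAIL
#  define MUSTTAIL
#endif

/* Hypothetical bookkeeping so a caller can observe which callee ran. */
enum callee { CALLEE_NONE, CALLEE_FOO, CALLEE_BAR };
static enum callee last_callee = CALLEE_NONE;

/* Stand-in definitions; the reduced IR only declares @foo and @bar. */
char *foo(char *ptr, uint64_t data) {
  (void)data;
  last_callee = CALLEE_FOO;
  return ptr;
}

char *bar(char *ptr, uint64_t data) {
  (void)data;
  last_callee = CALLEE_BAR;
  return ptr;
}

/* Mirrors the IR: a nonzero low byte takes %if.then (foo),
   a zero low byte takes %if.end (bar, with ptr advanced by 5). */
char *baz(char *ptr, uint64_t data) {
  if (data & 255) {
    MUSTTAIL return foo(ptr, data);
  }
  MUSTTAIL return bar(ptr + 5, data);
}
```

Both returns are tail calls with the same signature as baz, which is what lets each one become a plain jmp in the generated code.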

Some code added in D29856 in llvm/lib/CodeGen/BranchFolding.cpp looks like it could do the folding. Quick experimentation with removing the guards:

  1. OptForSize
  2. PredTBB == MBB

enables this optimization, but seems to regress quite a few tests:

Failed Tests (29):
  LLVM :: CodeGen/X86/add.ll
  LLVM :: CodeGen/X86/atom-pad-short-functions.ll
  LLVM :: CodeGen/X86/avx512-i1test.ll
  LLVM :: CodeGen/X86/bmi.ll
  LLVM :: CodeGen/X86/brcond.ll
  LLVM :: CodeGen/X86/btq.ll
  LLVM :: CodeGen/X86/cmp.ll
  LLVM :: CodeGen/X86/conditional-tailcall-pgso.ll
  LLVM :: CodeGen/X86/conditional-tailcall.ll
  LLVM :: CodeGen/X86/copy-eflags.ll
  LLVM :: CodeGen/X86/extern_weak.ll
  LLVM :: CodeGen/X86/fold-rmw-ops.ll
  LLVM :: CodeGen/X86/fp-strict-scalar-cmp.ll
  LLVM :: CodeGen/X86/funnel-shift.ll
  LLVM :: CodeGen/X86/neg_cmp.ll
  LLVM :: CodeGen/X86/or-branch.ll
  LLVM :: CodeGen/X86/peep-test-4.ll
  LLVM :: CodeGen/X86/pr37063.ll
  LLVM :: CodeGen/X86/rd-mod-wr-eflags.ll
  LLVM :: CodeGen/X86/sibcall.ll
  LLVM :: CodeGen/X86/slow-incdec.ll
  LLVM :: CodeGen/X86/sqrt-partial.ll
  LLVM :: CodeGen/X86/switch-bt.ll
  LLVM :: CodeGen/X86/tail-call-conditional.mir
  LLVM :: CodeGen/X86/tail-opts.ll
  LLVM :: CodeGen/X86/tailcall-cgp-dup.ll
  LLVM :: CodeGen/X86/tailcall-extract.ll
  LLVM :: CodeGen/X86/xor-icmp.ll
  LLVM :: DebugInfo/COFF/pgo.ll

Some of these are straightforward fixes that I think make sense, but others like llvm/test/CodeGen/X86/conditional-tailcall.ll look quite wrong (branching the wrong way, IIUC)!
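
One way to read why those two guards block this test case, judging only from the llc output above: baz carries no optimize-for-size attribute, and the block that consists solely of a tail call (%if.then) is the fall-through of `je .LBB0_2`, not its taken target. The toy model below is not LLVM code; the toy_block, toy_cond_branch, and toy_can_fold names are hypothetical and only mirror the two conditions listed above.

```c
#include <stdbool.h>

/* Toy stand-in for a machine basic block (hypothetical, not an LLVM type). */
struct toy_block {
  bool is_only_tail_call; /* block is just "jmp callee" */
};

/* Toy stand-in for the predecessor's conditional branch. */
struct toy_cond_branch {
  const struct toy_block *taken;       /* plays the role of PredTBB */
  const struct toy_block *fallthrough; /* block reached when not taken */
};

/* Returns true when the modeled guards would allow turning the
   conditional branch into a conditional tail call to mbb. */
static bool toy_can_fold(bool opt_for_size,
                         const struct toy_cond_branch *pred,
                         const struct toy_block *mbb) {
  if (!opt_for_size) {
    return false; /* guard 1: OptForSize */
  }
  if (pred->taken != mbb) {
    return false; /* guard 2: PredTBB == MBB */
  }
  return mbb->is_only_tail_call;
}
```

In this model the test case fails both guards: the function is not size-optimized, and the tail-call-only block sits on the fall-through edge. Folding it would require inverting the branch condition first so that the tail-call block becomes the taken target.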

Quuxplusone commented 3 years ago

Sorry, I think that should have been:

though we could have produced:

baz:
  # %bb.0: # %entry
  testb %sil, %sil
  jne foo # TAILCALL  <== JNE
  # %bb.1: # %if.end
  addq $5, %rdi
  jmp bar # TAILCALL
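
The polarity can be double-checked with a small model, sketched here under the assumption that `testb %sil, %sil` sets ZF exactly when the low byte of %data is zero. The enum and function names are hypothetical helpers, not part of the test case.

```c
#include <stdbool.h>
#include <stdint.h>

enum target { TARGET_FOO, TARGET_BAR };

/* Semantics of the reduced IR: %tobool.not is true when the low byte
   is zero, and a true condition branches to %if.end (bar). */
static enum target ir_target(uint64_t data) {
  bool tobool_not = (data & 255) == 0;
  return tobool_not ? TARGET_BAR : TARGET_FOO;
}

/* testb %sil, %sil: ZF is set when the low byte of data is zero. */
static bool zf_after_testb(uint64_t data) {
  return (uint8_t)data == 0;
}

/* "je foo": the jump to foo is taken when ZF is set. */
static enum target je_foo_target(uint64_t data) {
  return zf_after_testb(data) ? TARGET_FOO : TARGET_BAR;
}

/* "jne foo": the jump to foo is taken when ZF is clear. */
static enum target jne_foo_target(uint64_t data) {
  return zf_after_testb(data) ? TARGET_BAR : TARGET_FOO;
}
```

Under this model jne_foo_target agrees with ir_target for every input, while je_foo_target picks the opposite callee every time, which matches the JNE correction.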

Quuxplusone commented 3 years ago

_This bug has been marked as a duplicate of bug 50125_