llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.22k stars 11.65k forks source link

Regression with `builtin_expect` over multiple predicates #32683

Open llvmbot opened 7 years ago

llvmbot commented 7 years ago
Bugzilla Link 33336
Version trunk
OS Linux
Attachments Reproducing case
Reporter LLVM Bugzilla Contributor
CC @TNorthover

Extended Description

Testing using an UNLIKELY macro built around builtin_expect found a regression from 3.7 -> 3.8 that is still present in the 4.0 and likely trunk (untested) branches of llvm. The unlikely condition is not propagated to both sub-expressions when combined with an OR clause. Using multiple invocations of such a macro gives the expected result.

Apparently the branch probabilities are no longer being propagated correctly to child expressions.

I bisected llvm and clang in parallel on release_37 and release_38 branches to find the responsible change and found this change to be responsible. Seems entirely plausible as it introduces a new branch:

[JumpThreading] Split select that has constant conditions coming from the PHI node

Look for PHI/Select in the same BB of the form

bb:
  %p = phi [false, %bb1], [true, %bb2], [false, %bb3], [true, %bb4], ...
  %s = select p, trueval, falseval

And expand the select into a branch structure. This later enables
jump-threading over bb in this pass.

Using the similar approach of SimplifyCFG::FoldCondBranchOnPHI(), unfold
select if the associated PHI has at least one constant.  If the unfolded
select is not jump-threaded, it will be folded again in the later
optimizations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257198 91177308-0d34-0410-b5e6-96231b3b80d

Attached code which generates a reproduction - you can see the sub-optimal branch generated in the asm output. Let me know if you want IR output as well, but this reproduces quite easily. I copied my reproducing case verbatim, there is almost certainly a more minimal case that doesn't involve floating point, but this works reliably - Godbolt link: https://godbolt.org/g/vBwwQU

llvmbot commented 7 years ago

Minimal fix I have a minimal fix that addresses this limited case, but may not be sufficient to address the larger issue (if present), which is what happens in the more free-form case of

if (UNLIKELY(x || y || z || ...) { /.../ } // Execution continues...

llvmbot commented 7 years ago

It looks like the problem happens after CFG simplification. We change the control flow for the last block to:

lor.end: ; preds = %lor.rhs, %entry %0 = phi i1 [ true, %entry ], [ %cmp, %lor.rhs ] %. = select i1 %0, i1 true, i1 false ret i1 %.

This in turn gets jump-threaded, which does not seem advantageous, since the target in this case is simply a return.