Open Quuxplusone opened 3 years ago
Bugzilla Link | PR48353 |
Status | NEW |
Importance | P enhancement |
Reported by | Congzhe Cao (congzhecao@gmail.com) |
Reported on | 2020-12-01 14:16:31 -0800 |
Last modified on | 2020-12-22 12:07:02 -0800 |
Version | trunk |
Hardware | PC Linux |
CC | juneyoung.lee@sf.snu.ac.kr, lebedev.ri@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, meheff@google.com, nunoplopes@sapo.pt, regehr@cs.utah.edu, spatel+llvm@rotateright.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also | PR48435 |
So the transform that is causing miscompile is
----------------------------------------
define i32 @src(i32 %aj) {
%entry:
%cmp.i = icmp sgt i32 %aj, 1
%aj.op = lshr i32 3, %aj
%.op2 = and i32 %aj.op, 1
%cmp41 = icmp ne i32 %.op2, 0
%cmp4 = select i1 %cmp.i, i1 1, i1 %cmp41
%conv5 = zext i1 %cmp4 to i32
ret i32 %conv5
}
=>
define i32 @tgt(i32 %aj) {
%entry:
%cmp.i = icmp sgt i32 %aj, 1
%aj.op = lshr i32 3, %aj
%.op2 = and i32 %aj.op, 1
%cmp41 = icmp ne i32 %.op2, 0
%cmp4 = or i1 %cmp.i, %cmp41
%conv5 = zext i1 %cmp4 to i32
ret i32 %conv5
}
Transformation doesn't verify!
ERROR: Target is more poisonous than source
Example:
i32 %aj = #x00000020 (32)
Source:
i1 %cmp.i = #x1 (1)
i32 %aj.op = poison
i32 %.op2 = poison
i1 %cmp41 = poison
i1 %cmp4 = #x1 (1)
i32 %conv5 = #x00000001 (1)
Target:
i1 %cmp.i = #x1 (1)
i32 %aj.op = poison
i32 %.op2 = poison
i1 %cmp41 = poison
i1 %cmp4 = poison
i32 %conv5 = poison
Source value: #x00000001 (1)
Target value: poison
I.e.
----------------------------------------
define i1 @src(i1 %cmp.i, i1 %cmp41) {
%entry:
%cmp4 = select i1 %cmp.i, i1 1, i1 %cmp41
ret i1 %cmp4
}
=>
define i1 @tgt(i1 %cmp.i, i1 %cmp41) {
%entry:
%cmp4 = or i1 %cmp.i, %cmp41
ret i1 %cmp4
}
Transformation doesn't verify!
ERROR: Target is more poisonous than source
Example:
i1 %cmp.i = #x1 (1)
i1 %cmp41 = poison
Source:
i1 %cmp4 = #x1 (1)
Target:
i1 %cmp4 = poison
Source value: #x1 (1)
Target value: poison
... time to apply freeze?
Would it be a big change if optimizations that deal with or/and are updated to deal with the corresponding select patterns?
I tend to agree. I don't see much value in turning select -> and/or + freeze. I
would rather keep the select and make sure the backends are happy with it.
I think Sanjay already did some work in SelDAG to make it like selects more
AFAIR.
Yes, we've done some work in codegen to make it more robust with select in IR.
We've also tried to patch instcombine to make it more flexible about
recognizing patterns with selects.
If I'm seeing it correctly, this is one of the transforms noted here:
https://reviews.llvm.org/D72396
So we need to remove this block:
https://github.com/llvm/llvm-project/blob/f03c21df7b845e2ffcef42f242764f36603fdbb4/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp#L2627
And see what regresses?
> So we need to remove this block:
> https://github.com/llvm/llvm-project/blob/
> f03c21df7b845e2ffcef42f242764f36603fdbb4/llvm/lib/Transforms/InstCombine/
> InstCombineSelect.cpp#L2627
>
> And see what regresses?
Sounds like a plan! :)
(that whole block seems incorrect, yes)
(In reply to Sanjay Patel from comment #5)
> Yes, we've done some work in codegen to make it more robust with select in
> IR. We've also tried to patch instcombine to make it more flexible about
> recognizing patterns with selects.
>
> If I'm seeing it correctly, this is one of the transforms noted here:
> https://reviews.llvm.org/D72396
>
> So we need to remove this block:
> https://github.com/llvm/llvm-project/blob/
> f03c21df7b845e2ffcef42f242764f36603fdbb4/llvm/lib/Transforms/InstCombine/
> InstCombineSelect.cpp#L2627
>
> And see what regresses?
Thanks for the suggestion! (In reply to Sanjay Patel from comment #5)
> Yes, we've done some work in codegen to make it more robust with select in
> IR. We've also tried to patch instcombine to make it more flexible about
> recognizing patterns with selects.
>
> If I'm seeing it correctly, this is one of the transforms noted here:
> https://reviews.llvm.org/D72396
>
> So we need to remove this block:
> https://github.com/llvm/llvm-project/blob/
> f03c21df7b845e2ffcef42f242764f36603fdbb4/llvm/lib/Transforms/InstCombine/
> InstCombineSelect.cpp#L2627
>
> And see what regresses?
Thanks for the suggestion! Are you going to upstream it? We are going to
internally implement a patch as suggested and measure the performance anyways,
so we could just post the bug/solution/result to Phabricator if that makes
things easier. I'd appreciate your comment on it.
(In reply to Nuno Lopes from comment #4)
> I tend to agree. I don't see much value in turning select -> and/or +
> freeze. I would rather keep the select and make sure the backends are happy
> with it.
> I think Sanjay already did some work in SelDAG to make it like selects more
> AFAIR.
Thanks for the comment. I'm just curious what would be the potential solution
of "turning select -> and/or + freeze", what does it look like? Is it like just
freezing the result of and/or instruction?
The solution that uses freeze is to insert freeze at the operand or and/or
coming from non-conditional value of select:
select x, true, y -> or x, freeze(y)
select x, y, false -> and x, freeze(y)
Here is the test: https://alive2.llvm.org/ce/z/dwFA_n
(In reply to Juneyoung Lee from comment #10)
> Here is the test: https://alive2.llvm.org/ce/z/dwFA_n
Thanks for the demo! it makes perfect sense.
(In reply to Congzhe Cao from comment #7)
> (In reply to Sanjay Patel from comment #5)
> > Yes, we've done some work in codegen to make it more robust with select in
> > IR. We've also tried to patch instcombine to make it more flexible about
> > recognizing patterns with selects.
> >
> > If I'm seeing it correctly, this is one of the transforms noted here:
> > https://reviews.llvm.org/D72396
> >
> > So we need to remove this block:
> > https://github.com/llvm/llvm-project/blob/
> > f03c21df7b845e2ffcef42f242764f36603fdbb4/llvm/lib/Transforms/InstCombine/
> > InstCombineSelect.cpp#L2627
> >
> > And see what regresses?
>
> Thanks for the suggestion! (In reply to Sanjay Patel from comment #5)
> > Yes, we've done some work in codegen to make it more robust with select in
> > IR. We've also tried to patch instcombine to make it more flexible about
> > recognizing patterns with selects.
> >
> > If I'm seeing it correctly, this is one of the transforms noted here:
> > https://reviews.llvm.org/D72396
> >
> > So we need to remove this block:
> > https://github.com/llvm/llvm-project/blob/
> > f03c21df7b845e2ffcef42f242764f36603fdbb4/llvm/lib/Transforms/InstCombine/
> > InstCombineSelect.cpp#L2627
> >
> > And see what regresses?
>
> Thanks for the suggestion! Are you going to upstream it? We are going to
> internally implement a patch as suggested and measure the performance
> anyways, so we could just post the bug/solution/result to Phabricator if
> that makes things easier. I'd appreciate your comment on it.
Sorry, I missed that comment initially. If you can collect some perf
measurements with that change and/or post to Phab, that would be great.
I just looked at some of the instcombine regression test diffs, and we may need
to accelerate the canonicalization to the min/max intrinsics to avoid problems.
But we should check codegen for various targets to see if those are really
regressions that we must address.
(In reply to Sanjay Patel from comment #12)
> (In reply to Congzhe Cao from comment #7)
> > (In reply to Sanjay Patel from comment #5)
> > > Yes, we've done some work in codegen to make it more robust with select in
> > > IR. We've also tried to patch instcombine to make it more flexible about
> > > recognizing patterns with selects.
> > >
> > > If I'm seeing it correctly, this is one of the transforms noted here:
> > > https://reviews.llvm.org/D72396
> > >
> > > So we need to remove this block:
> > > https://github.com/llvm/llvm-project/blob/
> > > f03c21df7b845e2ffcef42f242764f36603fdbb4/llvm/lib/Transforms/InstCombine/
> > > InstCombineSelect.cpp#L2627
> > >
> > > And see what regresses?
> >
> > Thanks for the suggestion! (In reply to Sanjay Patel from comment #5)
> > > Yes, we've done some work in codegen to make it more robust with select in
> > > IR. We've also tried to patch instcombine to make it more flexible about
> > > recognizing patterns with selects.
> > >
> > > If I'm seeing it correctly, this is one of the transforms noted here:
> > > https://reviews.llvm.org/D72396
> > >
> > > So we need to remove this block:
> > > https://github.com/llvm/llvm-project/blob/
> > > f03c21df7b845e2ffcef42f242764f36603fdbb4/llvm/lib/Transforms/InstCombine/
> > > InstCombineSelect.cpp#L2627
> > >
> > > And see what regresses?
> >
> > Thanks for the suggestion! Are you going to upstream it? We are going to
> > internally implement a patch as suggested and measure the performance
> > anyways, so we could just post the bug/solution/result to Phabricator if
> > that makes things easier. I'd appreciate your comment on it.
>
> Sorry, I missed that comment initially. If you can collect some perf
> measurements with that change and/or post to Phab, that would be great.
>
> I just looked at some of the instcombine regression test diffs, and we may
> need to accelerate the canonicalization to the min/max intrinsics to avoid
> problems. But we should check codegen for various targets to see if those
> are really regressions that we must address.
Thanks for the reply! I followed the suggestion that removes this block:
https://github.com/llvm/llvm-project/blob/f03c21df7b845e2ffcef42f242764f36603fdbb4/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp#L2627.
But then the built clang crashes when building some tests. The cause is that,
since that block has been removed, when we now process instructions like
"select i1 Condition, i1 TrueValue, i1 FalseValue", we'll enter the following
code block to optimize it:
https://github.com/llvm/llvm-project/blob/f03c21df7b845e2ffcef42f242764f36603fdbb4/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp#L2669
And this results in "zext i1 to i1", causing the compiler to crash.
My fix to the crash is that, in the condition checking (line 2669), adding the
following checks to make sure TrueVal and FalseVal are not i1 type, which now
looks like:
if (SelType->isIntOrIntVectorTy() &&
CondVal->getType()->isVectorTy() == SelType->isVectorTy() &&
(TrueVal->getType()->isIntegerTy() &&
TrueVal->getType()->getIntegerBitWidth() != 1) &&
(FalseVal->getType()->isIntegerTy() &&
FalseVal->getType()->getIntegerBitWidth() != 1))
I'm wondering if you have comments on this fix? If it looks fine I'll go ahead
measuring the performance. Thanks!
Posted the fix suggested to Phabricator as work-in-progress, will update the progress and performance numbers there. Any review is welcome, please just let me know if I happen to miss someone.
_Bug 48435 has been marked as a duplicate of this bug._