Open XChy opened 11 months ago
Looks like a valid optimization to me. Not sure if, in principle, this should live in CVP or somewhere in ConstantPropagation.
@antoniofrighetto thanks for the feedback! I've changed the title until the pass to blame is identified.
I think it's an issue with CVP and LVI: in `processMinMaxIntrinsic` of CVP, we don't get the predicate between LHS and RHS at their uses, so the result is imprecise. We should add something like `getPredicateAtUse` in LVI and then use it in CVP.
@XChy I don't think that would be sufficient here. The issue is that `%div` is defined outside the branch condition and will use the value of `%a` without the condition. Even if you use the AtUse API, it will not be able to constrain the result of `%div` (as the condition is on `%a`, not `%div`).
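To make this point concrete, here is a minimal interval-arithmetic sketch in Python. It is not how LVI is implemented; it only illustrates why a range for `%div` computed without the branch condition cannot prove the `smax` redundant, while a range recomputed from the constrained `%a` can. The branch condition `a > 0`, the 32-bit width, and the absence of overflow in `a + 7` are assumptions made for illustration, and all helper names are hypothetical.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def sdiv(x: int, y: int) -> int:
    """Signed division that truncates toward zero, like LLVM's sdiv."""
    q = abs(x) // abs(y)
    return q if (x < 0) == (y < 0) else -q


def div_range(a_lo: int, a_hi: int) -> tuple:
    """Range of (a + 7) / 8 for a in [a_lo, a_hi].

    Assumes a + 7 does not overflow (hypothetical precondition), so the
    expression is monotonically non-decreasing in a and the endpoints of
    the input range give the endpoints of the result.
    """
    return sdiv(a_lo + 7, 8), sdiv(a_hi + 7, 8)


def smax_with_one_is_redundant(div_lo: int) -> bool:
    """smax(div, 1) == div for every value in the range iff div >= 1."""
    return div_lo >= 1


# Range of %div where it is defined, outside the branch: %a is unconstrained
# (capped at INT32_MAX - 7 to respect the no-overflow assumption).
unconstrained = div_range(INT32_MIN, INT32_MAX - 7)

# Range of %div recomputed under the assumed condition a > 0.
constrained = div_range(1, INT32_MAX - 7)

print(smax_with_one_is_redundant(unconstrained[0]))
print(smax_with_one_is_redundant(constrained[0]))
```

Under these assumptions, only the second query can justify dropping the `smax`, which is why a per-operand "at use" query on `%div` alone would not be enough.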
@nikic You're right, I missed that we can constrain only one value at a time now.
For the one-use example in this issue, an easy solution is to sink these one-use instructions as `SinkPass` does. But for multi-use scenarios in the real world, that's a bit tricky to handle.
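A rough source-level analogue of the sinking idea, written in Python purely for illustration. The control flow here (a guard `a > 0` with a fallback value of 0 on the other path) is an assumption and not the IR from the issue, and the function names are hypothetical. Once the single-use division is moved into the guarded block, it is evaluated only where `a > 0` holds, so clamping with `max(..., 1)` is a no-op there.

```python
def before_sinking(a: int) -> int:
    # The division is computed ahead of the branch, so on its own it can
    # still be 0 (e.g. a == 0) and the clamp cannot be dropped locally.
    # Floor division equals truncating division whenever a + 7 >= 0.
    div = (a + 7) // 8
    if a > 0:
        return max(div, 1)
    return 0  # hypothetical fallback path


def after_sinking(a: int) -> int:
    if a > 0:
        # Sunk into the guarded block: a >= 1 implies (a + 7) // 8 >= 1,
        # so the max with 1 has been removed.
        return (a + 7) // 8
    return 0  # hypothetical fallback path


# Both versions agree on a sample of inputs, including non-positive ones.
assert all(before_sinking(a) == after_sinking(a) for a in range(-1000, 1001))
```

With several uses of the division outside the guarded block, this move is no longer straightforward, which is the multi-use difficulty mentioned above.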
Alive2 proof: https://alive2.llvm.org/ce/z/Cyoo2Y

Missed example: https://godbolt.org/z/9zW8PToPe
In this example, `smax((a + 7) / 8, 1)` should be folded to `(a + 7) / 8` based on control flow context, but it is not.

Real-world motivation: This snippet of IR is derived from FFmpeg/.../extr_aacdec.c_latm_decode_audio_specific_config.c (after the O3 pipeline). The example above is a reduced version. If you're interested in the original suboptimal IR and the optimal IR, see also: https://godbolt.org/z/4a6K4b8hc
Let me know if you can confirm that it's an optimization opportunity, thanks.