llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[CVP] Missed optimization for smax((a + 7) / 8, 1) where a s> 0 #72378

Open XChy opened 11 months ago

XChy commented 11 months ago

Alive2 proof: https://alive2.llvm.org/ce/z/Cyoo2Y
Missed example: https://godbolt.org/z/9zW8PToPe

In this example, smax((a + 7) / 8, 1) should be folded to (a + 7) / 8 based on the control-flow context (a s> 0 on that path), but currently it is not.
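A minimal source-level sketch of why the fold is expected to hold. It models the pattern with C++ signed 32-bit arithmetic and assumes `a + 7` does not overflow; the function names `withSmax`, `folded`, and `foldHoldsUpTo` are hypothetical and do not come from the linked IR. It is a brute-force sanity check, not a replacement for the Alive2 proof.

```cpp
#include <algorithm>
#include <cstdint>

// Source-level model of the reduced pattern: smax((a + 7) / 8, 1).
// Assumption: callers keep a <= INT32_MAX - 7 so that a + 7 cannot overflow.
constexpr int32_t withSmax(int32_t a) {
  return std::max((a + 7) / 8, int32_t{1});
}

// The proposed folded form, without the smax.
constexpr int32_t folded(int32_t a) { return (a + 7) / 8; }

// Compare both forms for every a with 0 < a <= limit.
constexpr bool foldHoldsUpTo(int32_t limit) {
  for (int32_t a = 1; a <= limit; ++a)
    if (withSmax(a) != folded(a))
      return false;
  return true;
}
```

For a > 0 we have a + 7 >= 8, so the quotient is already at least 1 and the smax is a no-op. Without the guard the two forms differ, for example at a = 0, which is why the fold depends on the branch condition.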

Real-world motivation: this IR snippet is derived from FFmpeg/.../extr_aacdec.c_latm_decode_audio_specific_config.c (after the O3 pipeline). The example above is a reduced version. If you're interested in the original suboptimal IR and the optimal IR, see also: https://godbolt.org/z/4a6K4b8hc

Let me know if you can confirm that this is an optimization opportunity. Thanks.

antoniofrighetto commented 10 months ago

Looks like a valid optimization to me. I'm not sure whether, in principle, this should live in CVP or somewhere in ConstantPropagation.

XChy commented 10 months ago

@antoniofrighetto Thanks for the feedback! I've changed the title until the responsible pass is identified.

XChy commented 10 months ago

I think this is an issue in CVP and LVI: in processMinMaxIntrinsic of CVP, we don't query the predicate between LHS and RHS at their uses, so the result is imprecise. We should add something like getPredicateAtUse to LVI and then use it in CVP.

nikic commented 10 months ago

@XChy I don't think that would be sufficient here. The issue is that %div is defined outside the branch, so it will use the value of %a without the condition. Even if you use the AtUse API, it will not be able to constrain the result of %div (as the condition is on %a, not %div).
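The point above can be illustrated with a toy interval model. This is only a sketch: `Range`, `divRange`, and `smaxWithOneIsRedundant` are hypothetical names, not LLVM's ConstantRange or any LVI API, and the model assumes `a + 7` does not overflow by capping a at INT32_MAX - 7.

```cpp
#include <cstdint>

// Hypothetical closed interval [lo, hi]; a toy stand-in for a value range.
struct Range {
  int64_t lo, hi;
};

// Range of (a + 7) / 8 given a range for a. Adding 7 and dividing by a
// positive constant are both monotone non-decreasing (truncating division
// included), so mapping the two endpoints is enough.
constexpr Range divRange(Range a) { return {(a.lo + 7) / 8, (a.hi + 7) / 8}; }

// smax(x, 1) is redundant when every value in x's range is at least 1.
constexpr bool smaxWithOneIsRedundant(Range x) { return x.lo >= 1; }

// Assumption: cap a so that a + 7 never overflows a 32-bit value.
constexpr int64_t kMaxA = INT32_MAX - 7;

// What is known about a where the division is defined (no condition yet).
constexpr Range aAtDef{INT32_MIN, kMaxA};

// What is known about a inside the branch guarded by a s> 0.
constexpr Range aInBranch{1, kMaxA};
```

With `aAtDef`, the lower bound of the quotient is negative, so the smax cannot be dropped. Only when the quotient's range is re-derived from `aInBranch` does its lower bound become 1. Constraining a at the use site does not help unless that constraint is pushed through the expression that defines the quotient.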

XChy commented 10 months ago

@nikic You're right, I missed that we can currently constrain only one value at a time. For the one-use example in this issue, an easy solution is to sink such one-use instructions, as SinkPass does. But the multi-use scenarios found in real-world code are a bit trickier to handle.
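A source-level sketch of the sinking idea for the one-use case. The function names and the `return 0` fallback on the other path are assumptions for illustration only; the real IR is in the godbolt links above. Inputs are assumed to satisfy a <= INT32_MAX - 7 so that `a + 7` does not overflow.

```cpp
#include <algorithm>
#include <cstdint>

// Before sinking: the division is computed ahead of the branch, at a point
// where nothing is known about a.
constexpr int32_t beforeSink(int32_t a) {
  int32_t div = (a + 7) / 8; // defined outside the guarded region
  if (a > 0)
    return std::max(div, int32_t{1});
  return 0; // hypothetical fallback path
}

// After sinking: the single-use division lives inside the guarded region,
// so a > 0 is known at its definition and the max with 1 can be dropped.
constexpr int32_t afterSink(int32_t a) {
  if (a > 0) {
    int32_t div = (a + 7) / 8;
    return div;
  }
  return 0; // hypothetical fallback path
}
```

When the quotient has other users outside the branch, it cannot simply be moved, which is the multi-use difficulty mentioned above.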