llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.08k stars 11.1k forks source link

missed optimisation combining >= while > is ok? #62701

Open adamse opened 1 year ago

adamse commented 1 year ago

Exhibited in Rust when doing some parsing: https://rust.godbolt.org/z/xv1WjY55o

Someone had this to say about the issue (referencing https://godbolt.org/z/YW6b4fMeP which puts it in LLVM IR terms):

I think I see what's happening. The >= code is hitting InstCombinerImpl::foldICmpAddConstant, where it runs this transformation:

  // X+C >u C2 -> (X & ~C2) != C
  //   iff C & C2 == 0
  //       C2+1 is a power of 2

which turns

  %A = add i64 %val, -4
  %B = icmp ugt i64 %A, 3

=>
  %0 = and i64 %val, -4
  %B = icmp ne i64 %0, 4

But this later impacts InstCombinerImpl::foldAndOrOfICmpsUsingRanges(), which can fold icmps that involve adds, but not ands. The > version of the code doesn't hit the first fold, so the add/icmp doesn't get turned into an and/icmp, meaning the second fold runs, which is what actually combines the two icmps into one

gandhi56 commented 9 months ago

Disabling the following optimization under foldICmpAddConstant generates the intended output:

  // X+C >u C2 -> (X & ~C2) != C
  //   iff C & C2 == 0
  //       C2+1 is a power of 2

I wonder if we could tighten the conditions for this transformation to occur? Otherwise, the only other way is to defer this transformation which feels unnatural to do. @spatel-gh thoughts?

rotateright commented 9 months ago

I haven't been active on LLVM for a few months, so cc'ing @nikic @goldsteinn @RKSimon for more up-to-date feedback.

Avoiding other canonicalizations is not a complete solution (the source might be in the form with an 'and' already), and deferring is not ideal because the IR can clearly be reduced.

You probably need to add a transform that can handle and-of-icmps with a neg-pow2-masked-value. That is, match the generalized version of this:
https://alive2.llvm.org/ce/z/6GK9Xz

It might be worth finding where the 'eq' version of the pattern is reduced to see if it can be updated to also handle 'ne': https://alive2.llvm.org/ce/z/do5rN2