Open Quuxplusone opened 3 years ago
Bugzilla Link | PR49055 |
Status | NEW |
Importance | P enhancement |
Reported by | Simon Pilgrim (llvm-dev@redking.me.uk) |
Reported on | 2021-02-05 04:12:56 -0800 |
Last modified on | 2021-02-09 04:48:09 -0800 |
Version | trunk |
Hardware | PC Windows NT |
CC | craig.topper@gmail.com, lebedev.ri@gmail.com, llvm-bugs@lists.llvm.org, spatel+llvm@rotateright.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also |
The root question - which of these is canonical?
define i32 @src(i8 %x) {
%zx = zext i8 %x to i32
%s1 = shl nuw nsw i32 %zx, 8
%s2 = shl nuw nsw i32 %zx, 16
%or = or i32 %s1, %s2
ret i32 %or
}
define i32 @tgt(i8 %x) {
%zx = zext i8 %x to i32
%or = mul i32 %zx, 65792
ret i32 %or
}
https://alive2.llvm.org/ce/z/dqT39Y
The difference in instcombine is in
InstCombinerImpl::SimplifyUsingDistributiveLaws() ->
getBinOpsForFactorization().
We recognize add-of-shifts as a factorized multiply, but not or-of-shifts.
We're arguing for "mul" in this example, and we already do that partially, so
the backend should be equipped to decompose back to shift+logic.
Interestingly, the -reassociate pass already has a reverse transform from -
instcombine that turns the 'or' into 'add':
"ShouldConvertOrWithNoCommonBitsToAdd"
If we allow shift opcodes there, that's probably the quickest fix.
A little git digging turns up that the -reassociate logic was added relatively
recently:
https://reviews.llvm.org/rG70472f3
https://reviews.llvm.org/rG93f3d7f
So it seems fine to add 'shl' to the opcode list. It's awkward that we have
opposing canonicalizations in different passes, but I don't have any better
ideas.
I'll add a reassociate test and a phase ordering test to prevent this from
regressing.
The small fix to -reassociate improves, but does not solve the larger example
shown here:
https://reviews.llvm.org/rG6fd91be35444
We'll need to adjust reassociate and/or instcombine with at least 1 more change
to get this to canonicalize as expected.