Open Quuxplusone opened 2 years ago
Bugzilla Link | PR52402 |
Status | CONFIRMED |
Importance | P enhancement |
Reported by | Sanjay Patel (spatel+llvm@rotateright.com) |
Reported on | 2021-11-04 07:01:56 -0700 |
Last modified on | 2021-11-05 12:45:00 -0700 |
Version | trunk |
Hardware | PC All |
CC | anton.a.afanasyev@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also | PR52289, PR40493 |
Here is the sample of similar nature, showing how we miss full folding at -O3:
define i1 @src(i32 %ta) {
%t3 = trunc i32 %ta to i8
%t4 = and i8 %t3, 8
%t6.not = icmp eq i8 %t4, 0
%t7 = select i1 %t6.not, i8 7, i8 0
%t8 = shl i8 %t4, %t7
%t10 = sext i8 %t8 to i32
%t12 = mul i32 %t10, 1355350016
%t13 = icmp eq i32 %t12, 65536
ret i1 %t13
}
define i1 @tgt(i32 %ta) {
ret i1 false
}
https://alive2.llvm.org/ce/z/pbFLS2
https://gcc.godbolt.org/z/5ooM4f4rr
(In reply to Anton Afanasyev from comment #1)
> Here is the sample of similar nature, showing how we miss full folding at
> -O3:
>
> define i1 @src(i32 %ta) {
> %t3 = trunc i32 %ta to i8
> %t4 = and i8 %t3, 8
> %t6.not = icmp eq i8 %t4, 0
> %t7 = select i1 %t6.not, i8 7, i8 0
> %t8 = shl i8 %t4, %t7
> %t10 = sext i8 %t8 to i32
> %t12 = mul i32 %t10, 1355350016
> %t13 = icmp eq i32 %t12, 65536
> ret i1 %t13
> }
>
> define i1 @tgt(i32 %ta) {
> ret i1 false
> }
>
> https://alive2.llvm.org/ce/z/pbFLS2
> https://gcc.godbolt.org/z/5ooM4f4rr
The multiply fold might unlock this case, but there's another problem here that
might be easier to solve - we don't thread the binop (shl in this case) back
through the select-of-constants to see if that reduces:
https://alive2.llvm.org/ce/z/zw2px6
(In reply to Sanjay Patel from comment #2)
> The multiply fold might unlock this case, but there's another problem here
> that might be easier to solve - we don't thread the binop (shl in this case)
> back through the select-of-constants to see if that reduces:
> https://alive2.llvm.org/ce/z/zw2px6
This should be a separate bug, so filed as bug 52406.
The generalization can be formalized in terms of modular arithmetic here (https://en.wikipedia.org/wiki/Modular_arithmetic). Optimizer is capable already of solving ordinary lineary equation (3x=9
<=> x=3
) to get rid of multiplication, but can't do the same for modular linear equation (7x=5 (mod 16)
<=> x=3 (mod 16)
(https://gcc.godbolt.org/z/3WjKcxGhf).
The equation ax=b (mod c)
can be solved (with certain constrains on constants) for any modulo, but c = 2^n
case is of particular interest. We can try to solve this for patterns like icmp(and(mul(...)))
, using Euclidean algorithm (https://en.wikipedia.org/wiki/Euclidean_algorithm), though I'm not sure if this special case is worth it.