Open rotateright opened 2 years ago
The generalization can be formalized in terms of modular arithmetic (https://en.wikipedia.org/wiki/Modular_arithmetic). The optimizer can already solve an ordinary linear equation (3x=9 <=> x=3) to get rid of the multiplication, but it can't do the same for a modular linear equation (7x=5 (mod 16) <=> x=3 (mod 16)): https://gcc.godbolt.org/z/3WjKcxGhf
The equation ax=b (mod c) can be solved (with certain constraints on the constants) for any modulus, but the c = 2^n case is of particular interest. We can try to solve this for patterns like icmp(and(mul(...))), using the Euclidean algorithm (https://en.wikipedia.org/wiki/Euclidean_algorithm), though I'm not sure if this special case is worth it.
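For illustration only, here is a minimal Python sketch of solving ax=b (mod 2^n). It is not how InstCombine does (or would do) this; the function name `solve_mod_pow2` is made up for the example, and it relies on Python 3.8+ `pow(a, -1, m)` for the modular inverse rather than spelling out the extended Euclidean algorithm. The "constraints on the constants" show up as the divisibility check: if a has k trailing zero bits, b must have at least k as well, and the solution is then only determined modulo 2^(n-k).

```python
def solve_mod_pow2(a, b, n):
    """Solve a*x = b (mod 2**n).

    Returns (x0, m), meaning every solution satisfies x = x0 (mod m),
    or None if there is no solution. Hypothetical helper for illustration.
    """
    mod = 1 << n
    a %= mod
    b %= mod
    if a == 0:
        # 0*x = b has every x as a solution if b == 0, otherwise none.
        return (0, 1) if b == 0 else None
    # k = number of trailing zero bits of a, i.e. gcd(a, 2**n) == 2**k.
    k = (a & -a).bit_length() - 1
    if b % (1 << k) != 0:
        return None  # b is not divisible by gcd(a, 2**n)
    reduced_mod = 1 << (n - k)
    a_odd = a >> k  # odd, hence invertible modulo a power of two
    inv = pow(a_odd, -1, reduced_mod)
    return ((b >> k) * inv % reduced_mod, reduced_mod)


print(solve_mod_pow2(7, 5, 4))  # the 7x=5 (mod 16) example from above
```

For the 7x=5 (mod 16) case this gives x=3 (mod 16), since 7 is its own inverse modulo 16 (7*7=49=1) and 5*7=35=3.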
The multiply fold might unlock this case, but there's another problem here that might be easier to solve - we don't thread the binop (shl in this case) back through the select-of-constants to see if that reduces: https://alive2.llvm.org/ce/z/zw2px6
This should be a separate bug, so filed as bug 52406.
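To make the select-of-constants point concrete, here is a rough Python model of the shl/select pair from the @src sample in this thread (an assumption: the alive2 link may use a slightly different reduction). Pushing the shl into both arms of the select gives shl(%t4, 7) on the arm where %t4 is known to be 0 and shl(%t4, 0) on the other, so the whole thing should collapse to %t4. The helper names are invented for this sketch.

```python
def shl_i8(x, amt):
    # i8 shift-left with wraparound; amt < 8 here, so no poison to model.
    return (x << amt) & 0xFF


def original(t3):
    t4 = t3 & 8
    t7 = 7 if t4 == 0 else 0  # select i1 (icmp eq t4, 0), i8 7, i8 0
    return shl_i8(t4, t7)


def threaded(t3):
    t4 = t3 & 8
    # shl pushed into the select arms:
    #   true arm:  t4 == 0, so shl(0, 7) == 0
    #   false arm: shl(t4, 0) == t4
    return 0 if t4 == 0 else t4  # i.e. just t4


# Exhaustive over all i8 inputs.
assert all(original(v) == threaded(v) == (v & 8) for v in range(256))
```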
Here is a sample of a similar nature, showing how we miss the full fold at -O3:
define i1 @src(i32 %ta) {
  %t3 = trunc i32 %ta to i8
  %t4 = and i8 %t3, 8
  %t6.not = icmp eq i8 %t4, 0
  %t7 = select i1 %t6.not, i8 7, i8 0
  %t8 = shl i8 %t4, %t7
  %t10 = sext i8 %t8 to i32
  %t12 = mul i32 %t10, 1355350016
  %t13 = icmp eq i32 %t12, 65536
  ret i1 %t13
}
define i1 @tgt(i32 %ta) {
  ret i1 false
}
https://alive2.llvm.org/ce/z/pbFLS2 https://gcc.godbolt.org/z/5ooM4f4rr
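As a sanity check outside of Alive2, the sample can be modeled in a few lines of Python; this is only a sketch with made-up helper names, not a substitute for the proof in the link. Because of the trunc, only the low 8 bits of %ta matter, so 256 inputs are exhaustive. The reason it folds is visible in the constants: 1355350016 is 0x50C90000, i.e. an odd number shifted left by 16, so the multiply can only hit 65536 if %t10 is the inverse of 0x50C9 modulo 2^16, which is odd, while %t10 is always 0 or 8.

```python
MASK32 = 0xFFFFFFFF


def sext_i8_to_i32(v):
    # Sign-extend an 8-bit value and keep the 32-bit two's-complement pattern.
    return (v - 0x100 if v & 0x80 else v) & MASK32


def src(ta):
    t3 = ta & 0xFF                      # trunc i32 -> i8
    t4 = t3 & 8                         # and i8 %t3, 8
    t7 = 7 if t4 == 0 else 0            # select of constants
    t8 = (t4 << t7) & 0xFF              # shl i8
    t10 = sext_i8_to_i32(t8)            # sext i8 -> i32
    t12 = (t10 * 1355350016) & MASK32   # mul i32 with wraparound
    return t12 == 65536                 # icmp eq


def src_ever_true():
    # Only the low byte of %ta is observable after the trunc.
    return any(src(low) for low in range(256))


print(src_ever_true())
```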
assigned to @anton-afanasyev
Extended Description
Forking this off from bug 52289 - I don't know what the generalization is, but we're missing some kind of overflowing/shifting/multiplying magic:
https://alive2.llvm.org/ce/z/7Ft6jA