Open LebedevRI opened 5 years ago
And finally, https://reviews.llvm.org/D69125
And there's obviously a case where mask will not be needed: https://rise4fun.com/Alive/UCv https://rise4fun.com/Alive/rVA
Name: a, normal+mask Pre: C1+C2 u>= 16 %C1_wide = zext i16 C1 to i32 %onebit = shl i32 -1, %C1_wide %mask = xor i32 %onebit, -1 %masked = and i32 %mask, %x %truncated = trunc i32 %masked to i16 %r = shl i16 %truncated, C2 => %x_trunc = trunc i32 %x to i16 %r = shl i16 %x_trunc, C2
Name: c, normal+mask Pre: (C2-C1) >= -16 %C1_wide = zext i16 C1 to i32 %t0 = lshr i32 -1, %C1_wide %t1 = and i32 %t0, %x %t2 = trunc i32 %t1 to i16 %r = shl i16 %t2, C2 => %x_trunc = trunc i32 %x to i16 %r = shl i16 %x_trunc, C2
One last thing i will need to check here is whether it's meaningful to handle cases where's trunc in the pattern (before last shift).
Ha, that's nice, it kinda just works. https://rise4fun.com/Alive/elH https://rise4fun.com/Alive/icy
Name: a, normal+mask %onebit = shl i32 -1, C1 %mask = xor i32 %onebit, -1 %masked = and i32 %mask, %x %truncated = trunc i32 %masked to i16 %r = shl i16 %truncated, C2 => %C2_wide = zext i16 C2 to i32 %n0 = shl i32 %x, %C2_wide %n1 = add i32 C1, %C2_wide %n2 = zext i32 %n1 to i64 %n3 = shl i64 -1, %n2 %n4 = xor i64 %n3, -1 %n5 = trunc i64 %n4 to i16 %n6 = trunc i32 %n0 to i16 %r = and i16 %n6, %n5
Name: c, normal+mask %t0 = lshr i32 -1, C1 %t1 = and i32 %t0, %x %t2 = trunc i32 %t1 to i16 %r = shl i16 %t2, C2 => %C2_wide = zext i16 C2 to i32 %n0 = shl i32 %x, %C2_wide %n1_0 = sub i32 %C2_wide, C1 %n1_1 = sub i32 0, %n1_0 %n1 = add i32 %n1_1, 32 %n2 = zext i32 %n1 to i64 %n3 = lshr i64 -1, %n2 %n4 = trunc i64 %n3 to i16 %n5 = trunc i32 %n0 to i16 %r = and i16 %n4, %n5
One last thing i will need to check here is whether it's meaningful to handle cases where's trunc in the pattern (before last shift).
And posted
https://rise4fun.com/Alive/x2Cq Not unexpectedly, same is true for patterns a,b https://rise4fun.com/Alive/Mazd
... and to have nice handling of non-splats, while for patterns c-f this will just work https://rise4fun.com/Alive/Lqj Right, of course it won't needs zext/trunc too https://rise4fun.com/Alive/gslRa
, for a,b
some messing with zext/trunc will be needed https://rise4fun.com/Alive/plgI Hoping to finally look into that now..
And finally, https://reviews.llvm.org/D67677 (patterns a,b only)
And for c,d,e - https://reviews.llvm.org/D67725
https://rise4fun.com/Alive/x2Cq Not unexpectedly, same is true for patterns a,b https://rise4fun.com/Alive/Mazd
... and to have nice handling of non-splats, while for patterns c-f this will just work https://rise4fun.com/Alive/Lqj Right, of course it won't needs zext/trunc too https://rise4fun.com/Alive/gslRa
, for a,b
some messing with zext/trunc will be needed https://rise4fun.com/Alive/plgI Hoping to finally look into that now..
And finally, https://reviews.llvm.org/D67677 (patterns a,b only)
https://rise4fun.com/Alive/x2Cq Not unexpectedly, same is true for patterns a,b https://rise4fun.com/Alive/Mazd
... and to have nice handling of non-splats, while for patterns c-f this will just work https://rise4fun.com/Alive/Lqj, for a,b some messing with zext/trunc will be needed https://rise4fun.com/Alive/plgI Hoping to finally look into that now..
And finally, https://reviews.llvm.org/D67677 (patterns a,b only)
https://rise4fun.com/Alive/x2Cq Not unexpectedly, same is true for patterns a,b https://rise4fun.com/Alive/Mazd
... and to have nice handling of non-splats, while for patterns c-f this will just work https://rise4fun.com/Alive/Lqj, for a,b some messing with zext/trunc will be needed https://rise4fun.com/Alive/plgI Hoping to finally look into that now..
https://rise4fun.com/Alive/x2Cq Not unexpectedly, same is true for patterns a,b https://rise4fun.com/Alive/Mazd
Actually, i'm missing the point here. While indeed if the mask does not touch any bits that will remain after the shift means we can drop the mask, we still can trivially do the fold if it does touch some bits - that legality check also tells us how many high bits to clear.
https://rise4fun.com/Alive/x2Cq
Obviously, that won't work if we had ashr
.
The mask is constant, so we only need one extra instruction.
I guess that means the input to %r
needs to be one-use (or %x a constant.)
Not seeing any obvious follow-up improvements for this at the moment. I'm going to consider this resolved. Can be reopened if needed.
Okay, all 6 patterns/patches are up. I've tried to make those as easy for review as possible.
Starting to post patches, top-of-stack: https://reviews.llvm.org/D64512
assigned to @LebedevRI
Extended Description
Splitting off from llvm/llvm-project#41801 just so i can track them separately.
There's a bunch of patterns, even if we only look at 'masking'-like pattern. I guess i'm stumbling into a deep untouched woods that no one had in hotpaths..
a,b: https://rise4fun.com/Alive/4zsf c,d,e,f: https://rise4fun.com/Alive/RC49