Open LebedevRI opened 4 years ago
Hm, hacked it together. It works on the test case, but doesn't really help with larger cases. But i've found where we cause the problems in larger case.
Step 0: unsquattering LoopSink
name
https://reviews.llvm.org/D83686
Thanks, I see the point now. Loop rotate acts very straightforwardly, creating new Phis for every def from for.cond block, even the defs that can be safely sunk down from it. I'm not well familiar with loop rotate code, but I'd suggest to check if we can skip Phi creation for defs that can be sunk down (or maybe do such sinking before the actual rotation).
Hmm, thinking about this a bit more, this might be a lion in sheep's clothing, this might be one of the major things that obscured many loops i've looked at from vectorizer.
By "before the actual rotation", do you suggest to do that before
the whole loop rotation pass is run, outside of it,
because e.g. that can affect the profitability check
of the loop rotation (rotation-max-header-size
)?
Thanks, I see the point now. Loop rotate acts very straightforwardly, creating new Phis for every def from for.cond block, even the defs that can be safely sunk down from it. I'm not well familiar with loop rotate code, but I'd suggest to check if we can skip Phi creation for defs that can be sunk down (or maybe do such sinking before the actual rotation).
Hi Roman, what exactly are you trying to achieve in the end? Looking at the resulting IR, I don't see how it can be simplified.
Hi Roman, what exactly are you trying to achieve in the end? Looking at the resulting IR, I don't see how it can be simplified.
Extended Description
https://godbolt.org/z/d2JskK
Given
%i.0 = phi i32 [ 0, %entry ], [ %inc, %for.body ] %0 = and i32 %i.0, 1 <...> %inc = add nuw nsw i32 %i.0, 1
loop rotation produces
%0 = phi i32 [ 0, %for.body.lr.ph ], [ %2, %for.body ] %i.010 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ] <...> %inc = add nuw nsw i32 %i.010, 1 %2 = and i32 %inc, 1
Which seems kinda sub-optimal for obvious reasons - more registers used, %inc remains anyway, we loose potential to lower it to
test
, etc.Where would be the best place to undo this this?