llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.11k stars 12.01k forks source link

Pointless PHI after loop rotation #45928

Open LebedevRI opened 4 years ago

LebedevRI commented 4 years ago
Bugzilla Link 46583
Version trunk
OS Linux
CC @fhahn,@preames,@nikic,@rotateright

Extended Description

https://godbolt.org/z/d2JskK

Given

%i.0 = phi i32 [ 0, %entry ], [ %inc, %for.body ] %0 = and i32 %i.0, 1 <...> %inc = add nuw nsw i32 %i.0, 1

loop rotation produces

%0 = phi i32 [ 0, %for.body.lr.ph ], [ %2, %for.body ] %i.010 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ] <...> %inc = add nuw nsw i32 %i.010, 1 %2 = and i32 %inc, 1

Which seems kinda sub-optimal for obvious reasons - more registers used, %inc remains anyway, we loose potential to lower it to test, etc.

Where would be the best place to undo this this?

LebedevRI commented 4 years ago

Hm, hacked it together. It works on the test case, but doesn't really help with larger cases. But i've found where we cause the problems in larger case.

LebedevRI commented 4 years ago

Step 0: unsquattering LoopSink name https://reviews.llvm.org/D83686

LebedevRI commented 4 years ago

Thanks, I see the point now. Loop rotate acts very straightforwardly, creating new Phis for every def from for.cond block, even the defs that can be safely sunk down from it. I'm not well familiar with loop rotate code, but I'd suggest to check if we can skip Phi creation for defs that can be sunk down (or maybe do such sinking before the actual rotation).

Hmm, thinking about this a bit more, this might be a lion in sheep's clothing, this might be one of the major things that obscured many loops i've looked at from vectorizer.

By "before the actual rotation", do you suggest to do that before the whole loop rotation pass is run, outside of it, because e.g. that can affect the profitability check of the loop rotation (rotation-max-header-size)?

llvmbot commented 4 years ago

Thanks, I see the point now. Loop rotate acts very straightforwardly, creating new Phis for every def from for.cond block, even the defs that can be safely sunk down from it. I'm not well familiar with loop rotate code, but I'd suggest to check if we can skip Phi creation for defs that can be sunk down (or maybe do such sinking before the actual rotation).

LebedevRI commented 4 years ago

Hi Roman, what exactly are you trying to achieve in the end? Looking at the resulting IR, I don't see how it can be simplified.

https://godbolt.org/z/yFA44H

llvmbot commented 4 years ago

Hi Roman, what exactly are you trying to achieve in the end? Looking at the resulting IR, I don't see how it can be simplified.