Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Failure to exploit loop invariant store #21796

Open Quuxplusone opened 9 years ago

Quuxplusone commented 9 years ago
Bugzilla Link PR21797
Status NEW
Importance P normal
Reported by listmail@philipreames.com
Reported on 2014-12-09 16:54:48 -0800
Last modified on 2020-12-08 20:09:15 -0800
Version trunk
Hardware PC Linux
CC alina.sbirlea@gmail.com, chandlerc@gmail.com, cs15btech11044@iith.ac.in, david.bolvansky@gmail.com, florian_hahn@apple.com, hfinkel@anl.gov, llvm-bugs@lists.llvm.org, modocache@gmail.com, trent.tong@gmail.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
At O3, we are currently failing to exploit the loop invariant store in the
following example:
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

declare void @readmem() readonly

; Function Attrs: noreturn
define void @test1(i32* nonnull dereferenceable(4) %p, i32 %v) {
entry:
  br label %for.body

for.body:                                         ; preds = %for.body, %entry
  store i32 %v, i32* %p, align 4
  tail call void @readmem()
  br label %for.body
}

The store is of a loop invariant value to a loop invariant location.  The
address does not alias with any other store in the loop.  We should be able to
lift the store before the loop and execute it exactly once.

This might be implemented as a special case in LICM, but it could also be
handled by ensuring the loop is unrolled and recognizing that the store still
in the loop has become fully redundant.  I'm unsure which would be easier to
implement and/or more worthwhile.

I am unsure of the general applicability of this case.  I saw it come up in one
of our internal benchmarks, but only due to another optimizer bug.  I'm filing
it mostly because it seems like an interesting transform to consider.
Quuxplusone commented 9 years ago
I believe we can add if-case for store instructions here (after the load inst
if-case). and the code needs to check whether the stores are loaded by any
instruction in the load using the AliasSets.

/// canSinkOrHoistInst - Return true if the hoister and sinker can handle this
/// instruction.
///
bool LICM::canSinkOrHoistInst(Instruction &I) {
  // Loads have extra constraints we have to verify before we can hoist them.
  if (LoadInst *LI = dyn_cast<LoadInst>(&I)) {
    if (!LI->isUnordered())
      return false;        // Don't hoist volatile/atomic loads!

    // Loads from constant memory are always safe to move, even if they end up
    // in the same alias set as something that ends up being modified.
    if (AA->pointsToConstantMemory(LI->getOperand(0)))
      return true;
    if (LI->getMetadata(LLVMContext::MD_invariant_load))
      return true;

    // Don't hoist loads which have may-aliased stores in loop.
    uint64_t Size = 0;
    if (LI->getType()->isSized())
      Size = AA->getTypeStoreSize(LI->getType());

    AAMDNodes AAInfo;
    LI->getAAMetadata(AAInfo);

    return !pointerInvalidatedByLoop(LI->getOperand(0), Size, AAInfo);
  } else if (CallInst *CI = dyn_cast<CallInst>(&I)) {
    // Don't sink or hoist dbg info; it's legal, but not useful.
    if (isa<DbgInfoIntrinsic>(I))
      return false;
Quuxplusone commented 9 years ago
(In reply to comment #1)
> I believe we can add if-case for store instructions here (after the load
> inst if-case). and the code needs to check whether the stores are loaded by
> any instruction in the load using the AliasSets.
>
...

Yes, I think that's right.
Quuxplusone commented 9 years ago
In addition to the aforementioned place to modify, I think we need to change
isNotUsedInLoop to take care of the case in which we try to sink a store that
has aliasing uses (mod+ref) inside the loop.

void LICM::SinkRegion(DomTreeNode *N) {

    // Check to see if we can sink this instruction to the exit blocks
    // of the loop.  We can do this if the all users of the instruction are
    // outside of the loop.  In this case, it doesn't even matter if the
    // operands of the instruction are loop invariant.
    //
    if (isNotUsedInLoop(I) && canSinkOrHoistInst(I)) {
      ++II;
      sink(I);
    }
}

Using the AliasSet, we can find whether the store has aliasing uses inside the
loop or not.  If not we can go on to call canSinkOrHoistInst and sink/keep the
store accordingly.

In the canSinkOrHoistInst function, we can add a case for store. In this case,
we need to check whether there are any aliasing loads or stores not dominated
by the stores we are trying to hoist in the loop as well as whether the store
dominates the exit block.

I do not believe AliasSet keeps track of the instructions in which the pointers
are used. However, to use the dominance relation, we need to have the
instruction information.

Is there is an easier way to do this ?
Quuxplusone commented 9 years ago

Xin, to lift the store above the loop, I recommend you only worry about handling the case where the store is the only memory accessing instruction in the header. This guarantees that the store is visible to all loads in the loop. We do something similiar with the MayThrow flags to avoid introducing faults.

Get the simple case working (and checked in), then generalize if it's worthwhile.

The next logical extension would be to handle a store in the header which doesn't alias with other memory operations in the header. I suspect you could make that cheap as well.

In practice, unswitching and rotation due a remarkably good job of getting memory access instructions into the loop header. It doesn't always work, but unless you have a counter case you're particularly interested in, I'd suggest starting with the assumption it does.

Quuxplusone commented 9 years ago
Philip,  in case the store is the only instruction that accesses *p in the
basicblock, the aliasset promoter is able to promote it and eventually get rid
of the store all together. i.e. there is no need to have logic to hoist it out.

In case where the store does not alias with any other memory operations in the
basicblock, the aliasset promote can promotes it as well. because store to *p
is the only one in its aliasset.

However, for case in which call @readmem() has a mayAlias relation on *p, the
aliasset promoter gives up.

AliasSet[0x26eb0c0, 2] may alias, Mod/Ref   Pointers: (i32* %p,
18446744073709551615)
    1 Unknown instructions: void <badref>

void LICM::PromoteAliasSet(AliasSet &AS,
                           SmallVectorImpl<BasicBlock*> &ExitBlocks,
                           SmallVectorImpl<Instruction*> &InsertPts,
                           PredIteratorCache &PIC) {
  // We can promote this alias set if it has a store, if it is a "Must" alias
  // set, if the pointer is loop invariant, and if we are not eliminating any
  // volatile loads or stores.
  if (AS.isForwardingAliasSet() || !AS.isMod() || !AS.isMustAlias() ||
      AS.isVolatile() || !CurLoop->isLoopInvariant(AS.begin()->getValue()))
    return; /* GIVE UP HERE */

I think the one that can not be handled in the stored followed by an mayalias
load/store as what is shown in the your example. And this is what the
additional logic in canSinkOrHoistInst should do ?

Sorry if i am not make too much sense. I am new to LLVM.
Quuxplusone commented 5 years ago

I took a look at this report since it had been marked with the 'beginner' keyword, but I think I've reached the same conclusion as Xin. In the original description, it's said that "the store is of a loop invariant value to a loop invariant location," but I believe that cannot be known for certain, since the 'readmem' function may or may not read from that location. The alias set tracker correctly determines that 'readmem' may alias with the store, and so the LICM pass gives up on hoisting the store instruction.

Removing the call to 'readmem' from within the 'for.body' basic block and running LICM on the modified program results in the store instruction being hoisted, as described in this Bugzilla report.

Correct me if I'm wrong, but I believe that alias analysis and the LICM pass is working as expecting here, and so I think this report ought be closed.

Quuxplusone commented 5 years ago

Adding Alina as she has been looking a lot at making LICM more powerful (especially in the hoisting/sinking dept.) and may be able to generally provide thoughts / feedback.

Quuxplusone commented 5 years ago

Could you test this with -enable-mssa-loop-dependency?

I will look in detail starting next Tuesday.

Quuxplusone commented 5 years ago
Thanks for the suggestion, Alina and Chandler! Yes, when I run opt with -enable-
mssa-loop-dependency, the store instruction *is* hoisted out of the loop. Great
work, thanks!

I do wonder whether this report should be closed, though -- the instruction is
hoisted thanks to the new MSSA analysis, and it is not hoisted when that
analysis is unavailable (and therefore LICM can't determine aliasing fully).
Quuxplusone commented 3 years ago
(In reply to Brian Gesiak from comment #9)
> Thanks for the suggestion, Alina and Chandler! Yes, when I run opt with
> -enable-mssa-loop-dependency, the store instruction *is* hoisted out of the
> loop. Great work, thanks!
>

I am not sure if that is a recent regression, but it seems like it is not
hoisted out any longer for the original example: https://godbolt.org/z/hEWs9a
Quuxplusone commented 3 years ago

Worked llvm 8, regressed since llvm 9