llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.86k stars 11.91k forks source link

[indvars] Redundant loop increment after IV widening #28994

Open arpilipe opened 8 years ago

arpilipe commented 8 years ago
Bugzilla Link 28620
Version trunk
OS All
Attachments Reproducer
CC @hfinkel,@jrmuizel,@sanjoy,@rotateright

Extended Description

On the following loop indvars creates a wider IV but fails to eliminate a narrow type increment leading to an extra computation inside of the loop:

 loop:
  %i = phi i32 [ %i.next, %loop ], [ %i.entry, %entry ]

  %guard.cond = icmp ult i32 %i, %guard.1.limit
  call void (i1, ...) @​llvm.experimental.guard(i1 %guard.cond, i32 9) [ "deopt"(i32 1) ]

  %i.zext = zext i32 %i to i64
  call void @​consume(i64 %i.zext)

  %i.next = add i32 %i, 1
  %go.next = icmp slt i32 %i.next, %n
  br i1 %go.next, label %loop, label %exit

After indvars:

 loop:
  %indvars.iv = phi i64 [ %indvars.iv.next, %loop ], [ %0, %entry ]
  %guard.cond = icmp ult i64 %indvars.iv, %1
  call void (i1, ...) @​llvm.experimental.guard(i1 %guard.cond, i32 9) [ "deopt"(i32 1) ]
  call void @​consume(i64 %indvars.iv)
  %2 = trunc i64 %indvars.iv to i32
  %i.next = add i32 %2, 1
  %go.next = icmp slt i32 %i.next, %n
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  br i1 %go.next, label %loop, label %exit

The problem is not specific to guards, lowering it before indvars leads to the same result.

llvmbot commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#30428

llvmbot commented 8 years ago

Bug llvm/llvm-bugzilla-archive#30428 has been marked as a duplicate of this bug.

sanjoy commented 8 years ago

After looking at the example, I think even though we could solve this inside SimplifyIndvar::strengthenOverflowingOperation, we should try to do a more general transform. Specifically, I think teaching CorrelatedValuePropagation to mark additions and subtractions as nsw or nuw when legal (checking using LVI) will be more powerful than teaching SimplifyIndvar, since it would kick in for non-induction variables as well (and it would solve this problem too).

If we go with the plan above, I think there are three subtasks to getting the example in this bug working, all of which can be done in parallel:

Once the two things above are done, we should be optimizing the post-guard-lowering version of the example you've given. Then what remains is:

after which the example as given should work.

One thing I found suspicious -- the key method for step (1) is this:

/// Return the ConstantRange constraint that is known to hold for the /// specified value at the end of the specified block. This may only be called /// on integer-typed Values. ConstantRange getConstantRange(Value V, BasicBlock BB, Instruction *CxtI = nullptr);

and I couldn't quite figure out what the intended semantics are for CxtI, given that it states the range is valid at the end of BB. Hopefully "fixing" this will just require adjusting the documentation a bit.

arpilipe commented 8 years ago

assigned to @arpilipe