Open · jayfoad opened this issue 4 years ago
Doesn't reproduce anymore, so I'll assume it is fixed.
Here's an updated repro for the same kind of failure:
target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7"
define void @f(i32 %arg) {
bb:
%i = and i32 %arg, 31
%i1 = icmp ugt i32 %i, 0
br i1 %i1, label %bb2, label %bb8
bb2:
%i3 = phi i32 [ %i6, %bb2 ], [ 0, %bb ]
%i4 = shl i32 %i3, 1
%i5 = call i32 @g(i32 %i4)
%i6 = add i32 %i3, 1
%i7 = icmp eq i32 %i6, %i
br i1 %i7, label %bb8, label %bb2
bb8:
ret void
}
declare i32 @g(i32)
opt -passes=loop-reduce -verify-scev -verify-scev-strict -o /dev/null r.ll
Trip Count for Loop at depth 1 containing: %bb2<header><latch><exiting>
Changed!
Old: (-1 + (zext i5 (trunc i32 %arg to i5) to i32))<nsw>
New: ((-2 + (2 * (zext i5 (trunc i32 %arg to i5) to i32))<nuw><nsw>)<nsw> /u 2)
Delta: (-1 + (zext i5 (trunc i32 %arg to i5) to i32) + (-1 * ((-2 + (2 * (zext i5 (trunc i32 %arg to i5) to i32))<nuw><nsw>)<nsw> /u 2))<nsw>)
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/jayfoad2/llvm-debug/bin/opt -passes=loop-reduce -verify-scev -verify-scev-strict -o /dev/null r.ll
#0 0x0000555eca02318a llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/jayfoad2/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:602:11
#1 0x0000555eca02333b PrintStackTraceSignalHandler(void*) /home/jayfoad2/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:676:1
#2 0x0000555eca0218a6 llvm::sys::RunSignalHandlers() /home/jayfoad2/git/llvm-project/llvm/lib/Support/Signals.cpp:104:5
#3 0x0000555eca023b55 SignalHandler(int) /home/jayfoad2/git/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
#4 0x00007f0bf5e5e520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
#5 0x00007f0bf5eb2a7c __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
#6 0x00007f0bf5eb2a7c __pthread_kill_internal ./nptl/pthread_kill.c:78:10
#7 0x00007f0bf5eb2a7c pthread_kill ./nptl/pthread_kill.c:89:10
#8 0x00007f0bf5e5e476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
#9 0x00007f0bf5e447f3 abort ./stdlib/abort.c:81:7
#10 0x0000555ec88ae958 llvm::ScalarEvolution::verify() const /home/jayfoad2/git/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:14062:3
#11 0x0000555eca6b6807 llvm::FunctionToLoopPassAdaptor::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) /home/jayfoad2/git/llvm-project/llvm/lib/Transforms/Scalar/LoopPassManager.cpp:325:9
#12 0x0000555eca554ad7 llvm::detail::PassModel<llvm::Function, llvm::FunctionToLoopPassAdaptor, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) /home/jayfoad2/git/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:89:17
#13 0x0000555ec958de3a llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) /home/jayfoad2/git/llvm-project/llvm/include/llvm/IR/PassManager.h:521:33
#14 0x0000555ec6b9b767 llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) /home/jayfoad2/git/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:89:17
#15 0x0000555ec958cb85 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/jayfoad2/git/llvm-project/llvm/lib/IR/PassManager.cpp:124:38
#16 0x0000555ec6b9b337 llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/jayfoad2/git/llvm-project/llvm/include/llvm/IR/PassManagerInternal.h:89:17
#17 0x0000555ec958d0ba llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) /home/jayfoad2/git/llvm-project/llvm/include/llvm/IR/PassManager.h:521:33
#18 0x0000555ec62d8ca7 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool) /home/jayfoad2/git/llvm-project/llvm/tools/opt/NewPMDriver.cpp:522:3
#19 0x0000555ec630aa3a main /home/jayfoad2/git/llvm-project/llvm/tools/opt/opt.cpp:719:12
#20 0x00007f0bf5e45d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#21 0x00007f0bf5e45e40 call_init ./csu/../csu/libc-start.c:128:20
#22 0x00007f0bf5e45e40 __libc_start_main ./csu/../csu/libc-start.c:379:5
#23 0x0000555ec62d13a5 _start (/home/jayfoad2/llvm-debug/bin/opt+0x578f3a5)
Aborted (core dumped)
@max-quazan FYI
Do we know which patch this started with?
I don't know, sorry, but I think it was probably a long time ago.
FWIW, a `verify-scev-strict` failure like this isn't a bug, just a potential optimization opportunity. The trip counts are equivalent, but SCEV isn't able to detect that the new one can be simplified to the old one.
That's a strange use of "verify". Should it be an Optimization Remark instead?
`verify-scev-strict` can fail either because of a bug or because SCEV cannot determine that two expressions are equivalent. This is why it is not enabled by default; instead, only the cases that are "definitely wrong" are reported.
It's intended as a verification option, but it has too many false positives to be practical for most purposes.
As to why this particular issue occurs: this is the usual problem with sub -> add canonicalization. The `sub 2` here is nuw, and it would be safe to fold the division into the operands, but `add -2` is not nuw.
This looks like a pretty basic case though, so maybe it would be worthwhile to special case it.
I tried to add support for this like this:
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 424ff1c26c68..07307980e287 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -3536,6 +3536,23 @@ const SCEV *ScalarEvolution::getUDivExpr(const SCEV *LHS,
if (Operands.size() == A->getNumOperands())
return getAddExpr(Operands);
}
+
+ // (Op+NegC)/RHSC == (Op-C)/RHSC == (Op/RHSC) - (C/RHSC) if
+ // the subtract is nuw and the divisions are exact. This is handled
+ // separately because SCEV can't represent sub nuw.
+ const auto *NegC = dyn_cast<SCEVConstant>(A->getOperand(0));
+ if (A->getNumOperands() == 2 && NegC && NegC->getAPInt().isNegative()) {
+ const SCEV *Op = A->getOperand(1);
+ APInt C = -NegC->getAPInt();
+ APInt NewC = C.udiv(RHSC->getAPInt());
+ if (C == NewC * RHSC->getAPInt() &&
+ getUnsignedRange(Op).unsignedSubMayOverflow(C) ==
+ ConstantRange::OverflowResult::NeverOverflows) {
+ const SCEV *NewOp = getUDivExpr(Op, RHS);
+ if (!isa<SCEVUDivExpr>(NewOp) && getMulExpr(NewOp, RHS) != Op)
+ return getMinusSCEV(NewOp, getConstant(NewC));
+ }
+ }
}
// Fold if both operands are constant.
But this doesn't work, because we don't know that `(zext i5 (trunc i32 %arg to i5) to i32)` is not zero. Or rather, we only know that after loop guard application, but loop guards aren't used for the BECount itself.
Looking at this SCEV:
New: ({-1,+,-2}<%bb1> + ({1,+,2}<%bb1> umax {2,+,2}<%bb1>))
why didn't we combine `AR1 + umax(AR2, AR3)` into `umax(AR1 + AR2, AR1 + AR3)`? Because their steps are opposite, it should be OK to just simplify it into a constant. I guess the only precondition here is that `start(AR1) + umax(start(AR2), start(AR3)) == umax(start(AR1) + start(AR2), start(AR1) + start(AR3))`, which I believe is true.
With the attached r.ll (in r.zip) and a debug build of opt, I get: