Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

clang crashes with "-mllvm -enable-newgvn": Assertion `ProcessedCount[V] < 100 && "Seem to have processed the same Value a lot"' failed #33444

Open Quuxplusone opened 7 years ago

Quuxplusone commented 7 years ago
Bugzilla Link PR34472
Status NEW
Importance P enhancement
Reported by Qirun Zhang (helloqirun@gmail.com)
Reported on 2017-09-05 00:36:51 -0700
Last modified on 2019-07-10 16:10:31 -0700
Version trunk
Hardware PC All
CC dberlin@dberlin.org, ditaliano@apple.com, florian_hahn@apple.com, llvm-bugs@lists.llvm.org, nick@wasmer.io, su@cs.ucdavis.edu
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also PR34935, PR37121, PR36500
It happens at r312510.

$ clang-trunk -v
clang version 6.0.0 (trunk 312510)
Target: x86_64-unknown-linux-gnu
Thread model: posix

$ clang-trunk -mllvm -enable-newgvn -O3 abc.c
clang-6.0: /home/absozero/trunk/llvm/lib/Transforms/Scalar/NewGVN.cpp:2853:
void {anonymous}::NewGVN::updateProcessedCount(const llvm::Value*): Assertion
`ProcessedCount[V] < 100 && "Seem to have processed the same Value a lot"'
failed.
#0 0x000000000232d56a llvm::sys::PrintStackTrace(llvm::raw_ostream&)
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x232d56a)
#1 0x000000000232b6fe llvm::sys::RunSignalHandlers() (/home/absozero/trunk/root-
clang/bin/clang-6.0+0x232b6fe)
#2 0x000000000232b860 SignalHandler(int) (/home/absozero/trunk/root-
clang/bin/clang-6.0+0x232b860)
#3 0x00007f5a5126e340 __restore_rt (/lib/x86_64-linux-
gnu/libpthread.so.0+0x10340)
#4 0x00007f5a5005fcc9 gsignal /build/buildd/eglibc-
2.19/signal/../nptl/sysdeps/unix/sysv/linux/raise.c:56:0
#5 0x00007f5a500630d8 abort /build/buildd/eglibc-2.19/stdlib/abort.c:91:0
#6 0x00007f5a50058b86 __assert_fail_base /build/buildd/eglibc-
2.19/assert/assert.c:92:0
#7 0x00007f5a50058c32 (/lib/x86_64-linux-gnu/libc.so.6+0x2fc32)
#8 0x0000000002229c50 (anonymous
namespace)::NewGVN::updateProcessedCount(llvm::Value const*)
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x2229c50)
#9 0x0000000002235a6e (anonymous
namespace)::NewGVN::iterateTouchedInstructions() (/home/absozero/trunk/root-
clang/bin/clang-6.0+0x2235a6e)
#10 0x0000000002237598 (anonymous namespace)::NewGVN::runGVN()
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x2237598)
#11 0x0000000002239367 (anonymous
namespace)::NewGVNLegacyPass::runOnFunction(llvm::Function&) [clone .part.1772]
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x2239367)
#12 0x0000000001eaa693 llvm::FPPassManager::runOnFunction(llvm::Function&)
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x1eaa693)
#13 0x000000000194fd26 (anonymous
namespace)::CGPassManager::runOnModule(llvm::Module&)
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x194fd26)
#14 0x0000000001eaafaa llvm::legacy::PassManagerImpl::run(llvm::Module&)
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x1eaafaa)
#15 0x00000000024cd275 (anonymous
namespace)::EmitAssemblyHelper::EmitAssembly(clang::BackendAction,
std::unique_ptr<llvm::raw_pwrite_stream,
std::default_delete<llvm::raw_pwrite_stream> >) (/home/absozero/trunk/root-
clang/bin/clang-6.0+0x24cd275)
#16 0x00000000024ce653 clang::EmitBackendOutput(clang::DiagnosticsEngine&,
clang::HeaderSearchOptions const&, clang::CodeGenOptions const&,
clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout
const&, llvm::Module*, clang::BackendAction,
std::unique_ptr<llvm::raw_pwrite_stream,
std::default_delete<llvm::raw_pwrite_stream> >) (/home/absozero/trunk/root-
clang/bin/clang-6.0+0x24ce653)
#17 0x0000000002cf1d5f
clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&)
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x2cf1d5f)
#18 0x0000000002f1533c clang::ParseAST(clang::Sema&, bool, bool)
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x2f1533c)
#19 0x0000000002cf1200 clang::CodeGenAction::ExecuteAction()
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x2cf1200)
#20 0x0000000002882fe6 clang::FrontendAction::Execute()
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x2882fe6)
#21 0x0000000002855c46
clang::CompilerInstance::ExecuteAction(clang::FrontendAction&)
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x2855c46)
#22 0x000000000291eba2
clang::ExecuteCompilerInvocation(clang::CompilerInstance*)
(/home/absozero/trunk/root-clang/bin/clang-6.0+0x291eba2)
#23 0x0000000000d47dc8 cc1_main(llvm::ArrayRef<char const*>, char const*,
void*) (/home/absozero/trunk/root-clang/bin/clang-6.0+0xd47dc8)
#24 0x0000000000cd51ac main (/home/absozero/trunk/root-clang/bin/clang-
6.0+0xcd51ac)
#25 0x00007f5a5004aec5 __libc_start_main /build/buildd/eglibc-2.19/csu/libc-
start.c:321:0
#26 0x0000000000d436e9 _start (/home/absozero/trunk/root-clang/bin/clang-
6.0+0xd436e9)
Stack dump:
0.  Program arguments: /home/absozero/trunk/root-clang/bin/clang-6.0 -cc1 -
triple x86_64-unknown-linux-gnu -emit-obj -disable-free -main-file-name abc.c -
mrelocation-model static -mthread-model posix -fmath-errno -masm-verbose -
mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu x86-64 -dwarf-
column-info -debugger-tuning=gdb -momit-leaf-frame-pointer -resource-dir
/home/absozero/trunk/root-clang/lib/clang/6.0.0 -c-isystem
/usr/local/include/csmith-2.3.0 -c-isystem . -cxx-isystem
/usr/local/include/csmith-2.3.0 -cxx-isystem . -internal-isystem
/usr/local/include -internal-isystem /home/absozero/trunk/root-
clang/lib/clang/6.0.0/include -internal-externc-isystem /usr/include/x86_64-
linux-gnu -internal-externc-isystem /include -internal-externc-isystem
/usr/include -O3 -fdebug-compilation-dir /home/absozero/projects/reduction -
ferror-limit 19 -fmessage-length 172 -fobjc-runtime=gcc -fdiagnostics-show-
option -fcolor-diagnostics -vectorize-loops -vectorize-slp -mllvm -enable-
newgvn -o /tmp/abc-fa1587.o -x c abc.c
1.  <eof> parser at end of file
2.  Per-module optimization passes
3.  Running pass 'CallGraph Pass Manager' on module 'abc.c'.
4.  Running pass 'Global Value Numbering' on function '@fn1'
clang-6.0: error: unable to execute command: Aborted (core dumped)
clang-6.0: error: clang frontend command failed due to signal (use -v to see
invocation)
clang version 6.0.0 (trunk 312510)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/bin
clang-6.0: note: diagnostic msg: PLEASE submit a bug report to
http://llvm.org/bugs/ and include the crash backtrace, preprocessed source, and
associated run script.
clang-6.0: note: diagnostic msg:
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-6.0: note: diagnostic msg: /tmp/abc-cc2508.c
clang-6.0: note: diagnostic msg: /tmp/abc-cc2508.sh
clang-6.0: note: diagnostic msg:

********************

$ cat abc.c
char a, f;
int b, c;
long d, e, g, h, i;
void fn1() {
  char j = 0;
  for (; b <= 3; b++)
    for (; a;) {
      long k;
      for (; k <= 3;)
        for (; i;) {
          h = 9;
          for (; h <= 65;) {
            k = j ?: (e = c);
            h &= d < 1;
            if (h)
              ;
            else
              for (;;)
                ;
          }
        l1:;
        }
    }
  for (;;) {
    for (; g;) {
      a = j % g % j;
      goto l1;
    }
    j = 9;
    for (; f;)
      ;
  }
}
Quuxplusone commented 7 years ago
Thanks. This doesn't have undef so it should just resolve.
I must have added a bug somewhere.
Quuxplusone commented 7 years ago
; ModuleID = 'bugpoint-reduced-simplified.bc'
source_filename = "bugpoint-output-49ea412.bc"
target triple = "x86_64-apple-darwin16.7.0"

define void @hoge() {
bb:
  br label %bb1

bb1:                                              ; preds = %bb12, %bb
  %tmp = phi i8 [ %tmp3, %bb12 ], [ 0, %bb ]
  br i1 undef, label %bb2, label %bb13

bb2:                                              ; preds = %bb4, %bb1
  %tmp3 = phi i8 [ %tmp, %bb1 ], [ %tmp5, %bb4 ]
  br i1 undef, label %bb12, label %bb4

bb4:                                              ; preds = %bb11, %bb2
  %tmp5 = phi i8 [ %tmp3, %bb2 ], [ %tmp7, %bb11 ]
  br i1 false, label %bb6, label %bb2

bb6:                                              ; preds = %bb9, %bb4
  %tmp7 = phi i8 [ %tmp5, %bb4 ], [ %tmp10, %bb9 ]
  br i1 undef, label %bb11, label %bb8

bb8:                                              ; preds = %bb8, %bb6
  br i1 true, label %bb8, label %bb9

bb9:                                              ; preds = %bb15, %bb8
  %tmp10 = phi i8 [ %tmp16, %bb15 ], [ %tmp7, %bb8 ]
  br label %bb6

bb11:                                             ; preds = %bb6
  br label %bb4

bb12:                                             ; preds = %bb2
  br label %bb1

bb13:                                             ; preds = %bb1
  br i1 undef, label %bb14, label %bb15

bb14:                                             ; preds = %bb13
  br label %bb15

bb15:                                             ; preds = %bb14, %bb13
  %tmp16 = phi i8 [ %tmp, %bb13 ], [ 9, %bb14 ]
  br label %bb9
}
Quuxplusone commented 7 years ago
It looks like the same issue, but it provides another reproducer.

$ clangpolly -v
clang version 6.0.0 (http://llvm.org/git/clang.git
5a2b037a53684751a712b11898a0cc69815f2ba7) (http://llvm.org/git/llvm.git
76db91a4f033defd405dfb4d14ba6c0c7e205ce0)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/su/bin
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9.4
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/5
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/5.3.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4.7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7.3
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.5
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.4
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.3.0
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/6.2.0
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
$
$ clangpolly -O3 -mllvm -disable-llvm-optzns -c -emit-llvm -o small.bc small.c
$ optpolly -gvn -newgvn -o small-opt.bc small.bc
optpolly:
/home/su/software/tmp/polly/llvm/lib/Transforms/Scalar/NewGVN.cpp:2853: void
{anonymous}::NewGVN::updateProcessedCount(const llvm::Value*): Assertion
`ProcessedCount[V] < 100 && "Seem to have processed the same Value a lot"'
failed.
#0 0x00000000038ad6f4 (optpolly+0x38ad6f4)
#1 0x00000000038ad785 (optpolly+0x38ad785)
#2 0x00000000038abb9a (optpolly+0x38abb9a)
#3 0x00000000038ad077 (optpolly+0x38ad077)
#4 0x00007f0245cb0330 __restore_rt (/lib/x86_64-linux-
gnu/libpthread.so.0+0x10330)
#5 0x00007f0244a98c37 gsignal /build/eglibc-SvCtMH/eglibc-
2.19/signal/../nptl/sysdeps/unix/sysv/linux/raise.c:56:0
#6 0x00007f0244a9c028 abort /build/eglibc-SvCtMH/eglibc-2.19/stdlib/abort.c:91:0
#7 0x00007f0244a91bf6 __assert_fail_base /build/eglibc-SvCtMH/eglibc-
2.19/assert/assert.c:92:0
#8 0x00007f0244a91ca2 (/lib/x86_64-linux-gnu/libc.so.6+0x2fca2)
#9 0x00000000036ee752 (optpolly+0x36ee752)
#10 0x00000000036f1227 (optpolly+0x36f1227)
#11 0x00000000036f1b75 (optpolly+0x36f1b75)
#12 0x00000000036f5310 (optpolly+0x36f5310)
#13 0x0000000003190d6a (optpolly+0x3190d6a)
#14 0x0000000003190efd (optpolly+0x3190efd)
#15 0x0000000003191298 (optpolly+0x3191298)
#16 0x00000000031919ef (optpolly+0x31919ef)
#17 0x0000000003191c2f (optpolly+0x3191c2f)
#18 0x00000000017fcbe8 (optpolly+0x17fcbe8)
#19 0x00007f0244a83f45 __libc_start_main /build/eglibc-SvCtMH/eglibc-
2.19/csu/libc-start.c:321:0
#20 0x00000000017d64f9 (optpolly+0x17d64f9)
Stack dump:
0.      Program arguments: optpolly -gvn -newgvn -o small-opt.bc small.bc
1.      Running pass 'Function Pass Manager' on module 'small.bc'.
2.      Running pass 'Global Value Numbering' on function '@fn2'
Aborted (core dumped)
$

----------------------------------------

int b, c, d;

int fn1 () { return 0; }

int fn2 ()
{
  int f;
 L:
  {
    int h;
    if (!b)
      {
        {
          int i = 0;
          for (; 1; )
            return i;
        }
        while (c)
          h = f;
        while (d)
          h = f;
      }
    else
      goto L;
  }
  f = fn1 ();
  goto L;
}

int main ()
{
  fn2 ();
  return 0;
}
Quuxplusone commented 7 years ago
Okay, so, trying to understand this to see what the end result really should be
:)

First, bb8->bb9 is unreachable, and bb4->bb6 is unreachable.

That gives tmp7 = tmp10 = tmp16
That immediately gives:

 bb1:                                              ; preds = %bb12, %bb
   %tmp = phi i8 [ %tmp3, %bb12 ], [ 0, %bb ]
   br i1 undef, label %bb2, label %bb13

 bb2:                                              ; preds = %bb4, %bb1
   %tmp3 = phi i8 [ 0, %bb1 ], [ %tmp5, %bb4 ]
   br i1 undef, label %bb12, label %bb4

 bb4:                                              ; preds = %bb11, %bb2
   %tmp5 = phi i8 [ %tmp3, %bb2 ], [ %tmp16, %bb11 ]
   br i1 false, label %bb6, label %bb2

 bb6:                                              ; preds = %bb9, %bb4
   %tmp7 = phi i8 [ %tmp16, %bb9 ]
   br i1 undef, label %bb11, label %bb8

 bb8:                                              ; preds = %bb8, %bb6
   br i1 true, label %bb8, label %bb9

 bb9:                                              ; preds = %bb15, %bb8
   %tmp10 = phi i8 [ %tmp16, %bb15 ]
   br label %bb6

 bb11:                                             ; preds = %bb6
   br label %bb4

 bb12:                                             ; preds = %bb2
   br label %bb1

 bb13:                                             ; preds = %bb1
   br i1 undef, label %bb14, label %bb15

 bb14:                                             ; preds = %bb13
   br label %bb15

 bb15:                                             ; preds = %bb14, %bb13
   %tmp16 = phi i8 [ %tmp, %bb13 ], [ 9, %bb14 ]

I'm thinking about what i believe the other values should be.  They are either
0, 9, or 0|9
Quuxplusone commented 7 years ago
The RPO order is :
Block bb is 0 in RPO order.
Block bb1 is 1 in RPO order.
Block bb13 is 2 in RPO order.
Block bb14 is 3 in RPO order.
Block bb15 is 4 in RPO order.
Block bb2 is 5 in RPO order.
Block bb4 is 6 in RPO order.
Block bb6 is 7 in RPO order.
Block bb8 is 8 in RPO order.
Block bb9 is 9 in RPO order.
Block bb11 is 10 in RPO order.
Block bb12 is 11 in RPO order.
bb8->bb9 is unreachable, and bb4->bb6 is unreachable.  Just removing the
unreachables, we have:

 bb1:                                              ; preds = %bb12, %bb
   %tmp = phi i8 [ %tmp3, %bb12 ], [ 0, %bb ]
   br i1 undef, label %bb2, label %bb13

 bb2:                                              ; preds = %bb4, %bb1
   %tmp3 = phi i8 [ 0, %bb1 ], [ %tmp5, %bb4 ]
   br i1 undef, label %bb12, label %bb4

 bb4:                                              ; preds = %bb11, %bb2
   %tmp5 = phi i8 [ %tmp3, %bb2 ], [ %tmp7, %bb11 ]
   br i1 false, label %bb6, label %bb2

 bb6:                                              ; preds = %bb9, %bb4
   %tmp7 = phi i8 [ %tmp10, %bb9 ]
   br i1 undef, label %bb11, label %bb8

 bb8:                                              ; preds = %bb8, %bb6
   br i1 true, label %bb8, label %bb9

 bb9:                                              ; preds = %bb15, %bb8
   %tmp10 = phi i8 [ %tmp16, %bb15 ]
   br label %bb6

 bb11:                                             ; preds = %bb6
   br label %bb4

 bb12:                                             ; preds = %bb2
   br label %bb1

 bb13:                                             ; preds = %bb1
   br i1 undef, label %bb14, label %bb15

 bb14:                                             ; preds = %bb13
   br label %bb15

 bb15:                                             ; preds = %bb14, %bb13
   %tmp16 = phi i8 [ %tmp, %bb13 ], [ 9, %bb14 ]

The RPO algorithm should go like this:
(there are no expression lookups that will ever find anything, so this should
be the same between the two value numbering algorithms)

Iteration 1
VN(tmp) = phi (VN(tmp3) == TOP, 0) == 0
VN(tmp16) = phi(VN(tmp) == 0, 9) == tmp16
VN(tmp3) = phi(0, VN(tmp5) == TOP) == 0
VN(tmp5) = phi(VN(tmp3) == 0, VN(tmp7) == TOP) == 0
VN(tmp7) = phi(VN(tmp10) = TOP = TOP
VN(tmp10) = phi(VN(tmp16) = tmp16 = tmp16

end of iteration 1 state
VN(tmp) = 0
VN(tmp16) = tmp16
VN(tmp3) = 0
VN(tmp5) = 0
VN(tmp7) = TOP
VN(tmp10 = tmp16

Iteration 2
VN(tmp) = phi(VN(tmp3) == 0, 0) = 0
VN(tmp16) = phi(VN(tmp) == 0), 9) = tmp16
VN(tmp3) = phi(0, VN(tmp5) == 0) = 0
VN(tmp5) = phi(VN(tmp3) == 0, VN(tmp7) == TOP) = 0
VN(tmp7) = phi(VN(tmp10) == tmp16) = tmp16
VN(tmp10) = phi(VN(tmp16) == tmp16) = tmp16

End of iteration 2 state
VN(tmp) = 0
VN(tmp16) = tmp16
VN(tmp3) = 0
VN(tmp5) = 0
VN(tmp7) = tmp16
VN(tmp10) = tmp16

Iteration 3:
VN(tmp) = phi(VN(tmp3) == 0, 0) = 0
VN(tmp16) = phi(VN(tmp) == 0), 9) = tmp16
VN(tmp3) = phi(0, VN(tmp5) == 0) = 0
VN(tmp5) = phi(VN(tmp3) == 0, VN(tmp7) == tmp16) = tmp5
VN(tmp7) = phi(VN(tmp10) == tmp16) = tmp16
VN(tmp10) = phi(VN(tmp16) == tmp16) = tmp16

End state of iteration 3:
VN(tmp) = 0
VN(tmp16) = tmp16
VN(tmp3) = 0
VN(tmp5) = tmp5
VN(tmp7) = tmp16
VN(tmp10) = tmp16

Iteration 4:
VN(tmp) = phi(VN(tmp3) == 0, 0) = 0
VN(tmp16) = phi(VN(tmp) == 0), 9) = tmp16
VN(tmp3) = phi(0, VN(tmp5) == tmp5) = tmp3
VN(tmp5) = phi(VN(tmp3) == tmp3, VN(tmp7) == tmp16) = tmp5
VN(tmp7) = phi(VN(tmp10) == tmp16) = tmp16
VN(tmp10) = phi(VN(tmp16) == tmp16) = tmp16
end state of iteration 4:
VN(tmp) = 0
VN(tmp16) = tmp16
VN(tmp3) = tmp3
VN(tmp5) = tmp5
VN(tmp7) = tmp16
VN(tmp10) = tmp16

Iteration 5:
VN(tmp) = phi(VN(tmp3) == tmp3, 0) = tmp
VN(tmp16) = phi(VN(tmp) == tmp), 9) = tmp16
VN(tmp3) = phi(0, VN(tmp5) == tmp5) = tmp3
VN(tmp5) = phi(VN(tmp3) == tmp3, VN(tmp7) == tmp16) = tmp5
VN(tmp7) = phi(VN(tmp10) == tmp16) = tmp16
VN(tmp10) = phi(VN(tmp16) == tmp16) = tmp16

end state of iteration 5:
VN(tmp) = tmp
VN(tmp16) = tmp16
VN(tmp3) = tmp3
VN(tmp5) = tmp5
VN(tmp7) = tmp16
VN(tmp10) = tmp16

Iteration 6:
VN(tmp) = phi(VN(tmp3) == tmp3, 0) = tmp
VN(tmp16) = phi(VN(tmp) == tmp), 9) = tmp16
VN(tmp3) = phi(0, VN(tmp5) == tmp5) = tmp3
VN(tmp5) = phi(VN(tmp3) == tmp3, VN(tmp7) == tmp16) = tmp5
VN(tmp7) = phi(VN(tmp10) == tmp16) = tmp16
VN(tmp10) = phi(VN(tmp16) == tmp16) = tmp16

end state of iteration 5:
VN(tmp) = tmp
VN(tmp16) = tmp16
VN(tmp3) = tmp3
VN(tmp5) = tmp5
VN(tmp7) = tmp16
VN(tmp10) = tmp16

Nothing has changed, this state should be the stable end state.

We should reach the same result.

I actually believe we should come to the same result in the same order, i will
see where we are going wrong.
Quuxplusone commented 7 years ago
Processing instruction   %tmp7 = phi i8 [ %tmp5, %bb4 ], [ %tmp10, %bb9 ]
bb4 -> bb6 is unreachable
No arguments of PHI node   %tmp7 = phi i8 [ %tmp5, %bb4 ], [ %tmp10, %bb9 ] are
live

This is wrong overall, it should end up as tmp16.
Quuxplusone commented 7 years ago

So, the bad news is i can prove that with one of the optimizations we do, the algorithm may not terminate.

I am going to disable that for now, and then i'll bring it back another way i can prove terminates, and then see how it could be integrated into the algorithm.

Quuxplusone commented 7 years ago
In particular, to write down some thoughts:

The RPO algorithm only has things fall down the lattice, essentially.
Partitions only split, things only really become less equivalent.   Our
algorithm essentially enables partition re-merges.  In the RPO world, it's easy
to guarantee termination by proving things only go in one direction on the
lattice.  In our world, termination is much harder to guarantee.

This means the RPO algorithm will not always produce optimal answers,
particularly if you try to merge it with unreachable code elimination.

Example:

 define void @hoge() {
 bb:
   br label %bb1

 bb1:                                              ; preds = %bb12, %bb
   %tmp = phi i32 [ 0, %bb ], [ 1, %bb12 ]
   br label %bb2

 bb2:                                              ; preds = %bb1
   br label %bb3

 bb3:                                              ; preds = %bb8, %bb2
   %tmp4 = phi i32 [ %tmp, %bb2 ], [ %tmp9, %bb8 ]
   br i1 undef, label %bb12, label %bb5

 bb5:                                              ; preds = %bb10, %bb3
   %tmp6 = phi i32 [ %tmp11, %bb10 ], [ %tmp4, %bb3 ]
   br label %bb7

 bb7:                                              ; preds = %bb5
   br i1 undef, label %bb10, label %bb8

 bb8:                                              ; preds = %bb7
   %tmp9 = phi i32 [ %tmp6, %bb7 ]
   br label %bb3

 bb10:                                             ; preds = %bb7
   %tmp11 = phi i32 [ %tmp6, %bb7 ]
   br label %bb5

 bb12:                                             ; preds = %bb3
   br label %bb1
 }
RPO order is file order (for our purposes)
RPO with all edges live:

Iteration 1:
VN(tmp) = tmp
VN(tmp4) = VN(tmp)=tmp, VN(tmp9)=TOP == tmp
VN(tmp6) = VN(tmp11)=TOP, VN(tmp4)=tmp == tmp
VN(tmp9) = VN(tmp6) == tmp
VN(tmp11) = VN(tmp6) = tmp

End state:
VN(everything) = tmp :)
This will remain the stable state next iteration.

If we ignore currently unreachable edges (bb12->bb1) to start, we get this:
Iteration 1:

VN(tmp) = 0
VN(tmp4) = VN(0)=0, VN(tmp9)=TOP == 0
VN(tmp6) = VN(tmp11)=TOP, VN(tmp4)=0 == 0
VN(tmp9) = VN(tmp6) == 0
VN(tmp11) = VN(tmp6) = 0

SO far so good
Now we've proved bb12->bb1 is reachable.

Iteration 2:

VN(tmp) = VN(0, 1) == tmp
VN(tmp4) = VN(tmp)=tmp, VN(tmp9)=0 == tmp4
VN(tmp6) = VN(tmp11)=0, VN(tmp4)=tmp4 == tmp6
VN(tmp9) = VN(tmp6) == tmp6
VN(tmp11) = VN(tmp6) = tmp6

Iteration 3:

VN(tmp) = VN(0, 1) == tmp
VN(tmp4) = VN(tmp)=tmp, VN(tmp9)=tmp6 == tmp4
VN(tmp6) = VN(tmp11)=tmp6, VN(tmp4)=tmp4 == tmp6
VN(tmp9) = VN(tmp6) == tmp6
VN(tmp11) = VN(tmp6) = tmp6

This is the stable end state.

Whoops!

Currently, we are performing an optimization that enables us to eliminate these
cycles and get the same answer as the non-unreachable code version.

Unfortunately, i can prove it is not terminating.

This leaves us with a few options:

1. Get the same "bad" answer as the RPO algorithm does after you start out by
ignoring some edges.
2. Build a separate phi optimization algorithm we run on each iteration to
destroy cycles.
3. Try some other approaches.
For example, we could reset all phi nodes in the same scc to TOP and keep going.
Because our edges only get *more* reachable, this should suffice to terminate.

I am going to experiment, but likely to start with option #1 if it starts
taking any real time to experiment.
Quuxplusone commented 7 years ago
probably the same (or similar)
https://bugs.llvm.org/show_bug.cgi?id=34935
Quuxplusone commented 6 years ago
FWIW: I have a half-done patch to do the proper scc finding and invalidation to
keep this optimization if anyone wants it.
Otherwise, i'd suggest just disabling the self-lookup optimization for now. It
can be proven not to converge without more.

Replace return lookupOperandLeader(P.first) != I with return true to make this
happen
Quuxplusone commented 6 years ago

Thanks Daniel!

FWIW: I have a half-done patch to do the proper scc finding and invalidation to keep this optimization if anyone wants it.

I would be happy to take a look! Otherwise I'll prepare a patch to disable the optimization for now!