llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.24k stars 11.66k forks source link

A very small test resulting in extremely long compilation in "clang -O2" version 3.5, trunk 199677 #18980

Open llvmbot opened 10 years ago

llvmbot commented 10 years ago
Bugzilla Link 18606
Version trunk
OS MacOS X
Blocks llvm/llvm-project#16805
Attachments Sources of t27056 test including its source, input, and outputs, Sources of t84350 test including its source, input, and outputs
Reporter LLVM Bugzilla Contributor
CC @hfinkel,@joker-eph,@sunfishcode,@arsenm

Extended Description

This is one of the tests generated by icFuzz in the attachment of comment http://llvm.org/bugs/show_bug.cgi?id=16431#c30

the test resuls in an extremely long compilation in clang version 3.5, trunk 199677. Clang -O2 compilation seems to run until it goes out of memory. The core of the test is very simple:

for (kj = 1; kj < 2; ++kj) {
    for (jd = 1; jd < 32; ++jd) {
        zz *= zz;
    }
}

I guess the problem is a result of unrolling and creating a sequence of statements that result in analysis nodes that have exponential complexity. If the 2nd loop upper bound (32) is changed to a larger number (e.g., 132) or a much smaller number (e.g., 12), compilation terminates fast.

llvmbot commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#28830

joker-eph commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#28830

llvmbot commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#28721

llvmbot commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#18607

llvmbot commented 7 years ago

This issue is still reproducible on this IR:

define i32 @​test(i32, i32) { bci_0: br label %bci_30

bci_68: ; preds = %bci_45 %local6.lcssa = phi i32 [ %local6, %bci_45 ] %.lcssa1.lcssa = phi i32 [ %37, %bci_45 ] %.lcssa.lcssa = phi i32 [ 34, %bci_45 ] %2 = add i32 %local6.lcssa, 262 %3 = add i32 %2, %.lcssa1.lcssa %4 = add i32 %3, %.lcssa.lcssa ret i32 %4

bci_30: ; preds = %bci_45, %bci_0 %local0 = phi i32 [ %0, %bci_0 ], [ %38, %bci_45 ] %local6 = phi i32 [ 2, %bci_0 ], [ %39, %bci_45 ] %5 = add i32 %local6, %local0 br label %bci_45

bci_45: ; preds = %bci_30 %6 = mul i32 %5, %5 %7 = mul i32 %6, %6 %8 = mul i32 %7, %7 %9 = mul i32 %8, %8 %10 = mul i32 %9, %9 %11 = mul i32 %10, %10 %12 = mul i32 %11, %11 %13 = mul i32 %12, %12 %14 = mul i32 %13, %13 %15 = mul i32 %14, %14 %16 = mul i32 %15, %15 %17 = mul i32 %16, %16 %18 = mul i32 %17, %17 %19 = mul i32 %18, %18 %20 = mul i32 %19, %19 %21 = mul i32 %20, %20 %22 = mul i32 %21, %21 %23 = mul i32 %22, %22 %24 = mul i32 %23, %23 %25 = mul i32 %24, %24 %26 = mul i32 %25, %25 %27 = mul i32 %26, %26 %28 = mul i32 %27, %27 %29 = mul i32 %28, %28 %30 = mul i32 %29, %29 %31 = mul i32 %30, %30 %32 = mul i32 %31, %31 %33 = mul i32 %32, %32 %34 = mul i32 %33, %33 %35 = mul i32 %34, %34 %36 = mul i32 %35, %35 %37 = mul i32 %36, %36 %38 = add i32 %37, -11 %39 = add i32 %local6, 1 %40 = icmp sgt i32 %39, 76 br i1 %40, label %bci_68, label %bci_30 }

I suspect this might be because we don't cash results inside of the "ScalarEvolution::GetMinTrailingZeros"

llvmbot commented 7 years ago

This problem should be fixed by r284868 and r284784.

joker-eph commented 8 years ago

Bug llvm/llvm-bugzilla-archive#28830 has been marked as a duplicate of this bug.

arsenm commented 8 years ago

Bug 28721 may possibly be related

joker-eph commented 8 years ago

Seems like a fix was proposed here: http://reviews.llvm.org/D3127 but wasn't reviewed.

llvmbot commented 10 years ago

Bug llvm/llvm-bugzilla-archive#18607 has been marked as a duplicate of this bug.

llvmbot commented 10 years ago

The issue is that after vectorization by a factor of 32, you have a chain of:

%mul.1 = mul i32 %mul, %mul %mul.2 = mul i32 %mul.1, %mul.1 ....

I feel that it means that for the 32th, the "flatten" multiplication contains 2^32 terms. It seems that ScalarEvolution is not protected against that.

Here is the IR for t270563 just before the "infinite" loop:

IR Dump Before Induction Variable Simplification for.cond2.preheader: ; preds = %entry, %for.body4 %inc67 = phi i32 [ 1, %entry ], [ %inc6, %for.body4 ] %mul.lcssa5 = phi i32 [ %zz.promoted4, %entry ], [ %mul.30, %for.body4 ] br label %for.body4

for.body4: ; preds = %for.cond2.preheader %mul = mul i32 %mul.lcssa5, %mul.lcssa5 %mul.1 = mul i32 %mul, %mul %mul.2 = mul i32 %mul.1, %mul.1 %mul.3 = mul i32 %mul.2, %mul.2 %mul.4 = mul i32 %mul.3, %mul.3 %mul.5 = mul i32 %mul.4, %mul.4 %mul.6 = mul i32 %mul.5, %mul.5 %mul.7 = mul i32 %mul.6, %mul.6 %mul.8 = mul i32 %mul.7, %mul.7 %mul.9 = mul i32 %mul.8, %mul.8 %mul.10 = mul i32 %mul.9, %mul.9 %mul.11 = mul i32 %mul.10, %mul.10 %mul.12 = mul i32 %mul.11, %mul.11 %mul.13 = mul i32 %mul.12, %mul.12 %mul.14 = mul i32 %mul.13, %mul.13 %mul.15 = mul i32 %mul.14, %mul.14 %mul.16 = mul i32 %mul.15, %mul.15 %mul.17 = mul i32 %mul.16, %mul.16 %mul.18 = mul i32 %mul.17, %mul.17 %mul.19 = mul i32 %mul.18, %mul.18 %mul.20 = mul i32 %mul.19, %mul.19 %mul.21 = mul i32 %mul.20, %mul.20 %mul.22 = mul i32 %mul.21, %mul.21 %mul.23 = mul i32 %mul.22, %mul.22 %mul.24 = mul i32 %mul.23, %mul.23 %mul.25 = mul i32 %mul.24, %mul.24 %mul.26 = mul i32 %mul.25, %mul.25 %mul.27 = mul i32 %mul.26, %mul.26 %mul.28 = mul i32 %mul.27, %mul.27 %mul.29 = mul i32 %mul.28, %mul.28 %mul.30 = mul i32 %mul.29, %mul.29 %inc6 = add i32 %inc67, 1 %cmp = icmp ult i32 %inc6, 2 br i1 %cmp, label %for.cond2.preheader, label %for.end7