llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.36k stars 12.14k forks source link

wrong code (compiled code hangs) at -O1 and above on x86_64-linux-gnu #27002

Closed zhendongsu closed 8 years ago

zhendongsu commented 8 years ago
Bugzilla Link 26628
Resolution FIXED
Resolved on Aug 03, 2016 16:55
Version trunk
OS All
CC @zmodem,@hfinkel,@preames,@nlewycky,@sanjoy

Extended Description

The following code is miscompiled by the current clang trunk on x86_64-linux-gnu at -O1 and above in both the 32-bit and 64-bit modes.

This is a regression from 3.7.x.

$ clang-trunk -v clang version 3.9.0 (trunk 260876) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /usr/local/tools/bin Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9 Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/4.9.3 Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/5.3.0 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.4.7 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.6.4 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.7.3 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.8.4 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9.3 Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/5.3.0 Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/4.9 Candidate multilib: .;@m64 Candidate multilib: 32;@m32 Candidate multilib: x32;@mx32 Selected multilib: .;@m64 $ $ clang-trunk -O0 small.c; ./a.out $ clang-3.7 -O1 small.c; ./a.out $ $ clang-trunk -O1 small.c $ timeout -s 9 10 ./a.out Killed $


volatile int a, b, c = 2; unsigned d;

int main () { int e = -32; d = -31; for (; d > 2; d++) for (e++;; b--) if (c) break; e && a; return 0;
}

zmodem commented 8 years ago

Sounds like this was fixed in r260981?

zhendongsu commented 8 years ago

Bug llvm/llvm-project#26990 has been marked as a duplicate of this bug.

preames commented 8 years ago

Reverted.

Sending lib/Analysis/LazyValueInfo.cpp Sending test/Transforms/CorrelatedValuePropagation/basic.ll Transmitting file data .. Committed revision 260981.

I'll prepare a new review with a fix. I'll make sure to include a test case derived from this bug which would have caught this problem.

preames commented 8 years ago

I'm about to revert the original change. I'll come up with a new version of the patch which doesn't suffer from the issue Sanjoy pointed out and post it for review.

@​Zhendong - Thank you for finding this. The small reproducer really really helped in this case. Trying to diagnose this from a large application failure would not have been fun.

@​Sanjoy - Thanks for diagnosing the problem. Much appreciated.

sanjoy commented 8 years ago

The change looks "obviously correct" on first glance, so I dug a little deeper.

Looks like we got bit by an issue in the ConstantRange API -- makeNoWrapRegion returns a constant range such that the operation it is asked about is guaranteed not to wrap. That is not what we want here -- we want a range such that the range excludes the values for which the operation definitely wraps. So, in the buggy transform, LVI thinks that X in "X +(nuw, nsw) 1" belongs to [0,INT_SMAX), when ConstantRange actually said "if X is in [0,INT_MAX) then X+1 will definitely not signed/unsigned wrap", not that "given that X+1 will not wrap, X is definitely in [0,INT_SMAX)". This is the same difference we have between makeAllowedICmpRegion and makeSatisfyingICmpRegion. In fact, X is very well allowed to be -2, even though -2 is not in [0, INT_SMAX). [1]

I really should have caught this in the review, since I've been bit by this before :(. Whoever fixes the bug (I'm happy to volunteer for this, since I'm responsible for the confusing API here) should rename makeNoWrapRegion to something more obvious and add some more detailed examples in the doc.

Note: the usage in SCEV is fine since it uses makeNoWrapRegion to compute what flags are safe to put on an expression, the inverse of what LVI is using it for.

[1]: In this specific case, where the RHS of makeNoWrapRegion is a constant integer, it is easy to think this difference is an artifact of non-precise ranges. But this is a more general issue -- e.g. makeNoWrapRegion(Add, [0, 2), NUW) will have to be [0,-1), even though -1 + 0 doesn't unsigned wrap, and there isn't a problem of imprecise ranges in this specific case.

sanjoy commented 8 years ago

Reverting r260705 fixes the problem:

""" [LVI] Exploit nsw/nuw when computing constant ranges

As the title says. Modelled after similar code in SCEV.

This is useful when analysing induction variables in loops which have been canonicalized by other passes. I wrote the tests as non-loops specifically to avoid the generality introduced in http://reviews.llvm.org/D17174. While that can handle many induction variables without needing to exploit nsw, there's no reason not to use it if we've already proven it.

Differential Revision: http://reviews.llvm.org/D17177

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260705 91177308-0d34-0410-b5e6-96231b3b80d8 """

(Not minimal) IR reproducer: -jump-threading on

target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-apple-macosx10.11.0"

@​c = global i32 2, align 4 @​d = common global i32 0, align 4 @​b = common global i32 0, align 4 @​a = common global i32 0, align 4

define i32 @​main() { entry: store i32 -31, i32* @​d br label %for.body

for.body: ; preds = %for.inc2, %entry %e.09 = phi i32 [ -32, %entry ], [ %inc, %for.inc2 ] %inc = add nuw nsw i32 %e.09, 1 %0 = load volatile i32, i32* @​c %tobool8 = icmp eq i32 %0, 0 br i1 %tobool8, label %for.inc.preheader, label %for.inc2

for.inc.preheader: ; preds = %for.body br label %for.inc

for.inc: ; preds = %for.inc.preheader, %for.inc %1 = load volatile i32, i32 @​b %dec = add nsw i32 %1, -1 store volatile i32 %dec, i32 @​b %2 = load volatile i32, i32* @​c %tobool = icmp eq i32 %2, 0 br i1 %tobool, label %for.inc, label %for.inc2.loopexit

for.inc2.loopexit: ; preds = %for.inc br label %for.inc2

for.inc2: ; preds = %for.inc2.loopexit, %for.body %exitcond = icmp eq i32 %inc, -1 br i1 %exitcond, label %for.end4, label %for.body

for.end4: ; preds = %for.inc2 store i32 0, i32* @​d br i1 false, label %for.end4.land.end_crit_edge, label %land.rhs

for.end4.land.end_crit_edge: ; preds = %for.end4 br label %land.end

land.rhs: ; preds = %for.end4 %3 = load volatile i32, i32* @​a br label %land.end

land.end: ; preds = %for.end4.land.end_crit_edge, %land.rhs ret i32 0 }

transforms br i1 %tobool, label %for.inc, label %for.inc2.loopexit to br i1 %tobool8, label %for.inc, label %for.body (which is incorrect).

zhendongsu commented 8 years ago

assigned to @preames