llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Infinite loop at -O2 and above at compile-time on clang 12.0.x to trunk version #50999

Closed haoxintu closed 3 years ago

haoxintu commented 3 years ago
Bugzilla Link 51657
Resolution FIXED
Resolved on Sep 06, 2021 10:01
Version trunk
OS Linux
CC @rotateright,@oToToT
Fixed by commit(s) 0d83e7203479

Extended Description

Hi all.

The following test program makes clang 12.0.x through trunk hang at -O2 and above.

$cat small.c

#include <stdint.h>

int a,b,c;
void d(int e) {
  int8_t f;
  int16_t g;
  int32_t i = &a;
  uint16_t j;
  int8_t *k = &c;
  int16_t l = 246;
  uint64_t m;
  int8_t n = &k;
  int64_t o;
  int16_t p;
  for (; p;) {
    int64_t q = o;
    for (q = 5; q; q += 1)
      if (k = b)
        for (j = 3; j; j++) {
          int8_t r;
          o = r;
        }
    for (; p <= 2; p++)
    s:
      l = 1;
  }
  g = m = e;
  uint64_t v;
  int32_t u = &i;
  uint64_t t = &v;
  f = u;
  f = c = l;
  v = (g ?: (u = m << n)) == f;
  for (; i <= 8; f = t)
    ;
  goto s;
}

$clang -w -O2 -m32 small.c // compiles endlessly; same at -O3 and -Os

$time clang -c -w -O1 -m32 small.c

real 0m0.059s
user 0m0.028s
sys 0m0.031s
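A compile-time hang like this is easier to confirm when the compiler runs under a wall-clock limit. The following is a minimal sketch, assuming a POSIX system; `run_with_timeout` and its return codes (-1 for an error, -2 for a timeout) are hypothetical names chosen for illustration, not part of clang or LLVM.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with the given arguments and a wall-clock
 * limit. Returns the child's exit status, -1 on a fork/wait error or abnormal
 * termination, and -2 if the limit expired and the child was killed. */
static int run_with_timeout(char *const argv[], int limit_seconds) {
    assert(limit_seconds > 0);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* exec failed */
    }

    time_t deadline = time(NULL) + limit_seconds;
    for (;;) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
        if (r < 0)
            return -1;
        if (time(NULL) >= deadline) {
            kill(pid, SIGKILL);       /* treat as a hang: stop the child */
            waitpid(pid, &status, 0); /* reap it so no zombie is left */
            return -2;
        }
        struct timespec pause = {0, 10 * 1000 * 1000}; /* poll every 10 ms */
        nanosleep(&pause, NULL);
    }
}
```

Calling it with `{"clang", "-c", "-w", "-O2", "-m32", "small.c", NULL}` and a limit of a few seconds would be expected to return -2 on an affected compiler, while the -O1 command above should return well within the limit.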

The clang version I used:
clang version 14.0.0 (https://github.com/llvm/llvm-project 022538f2764a255bd2c0da3a247791e764933a93)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/haoxin/haoxin-data/compilers/llvm-project/build/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/8
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/9
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64

Reproduced in Godbolt: https://godbolt.org/z/ndxn1cT91

Thanks, Haoxin

haoxintu commented 3 years ago

Hey Sanjay. Thanks for your time and insightful comments here!

I think I got the answer from your detailed explanation. I am happy to find potentially important bugs in compilers to make them more reliable, and thank you so much again for spending time fixing them!

Best, Haoxin

rotateright commented 3 years ago

> Just out of curiosity, do you know why the source code example cannot reproduce the issue but the IR can? In other words, does it often happen that a bug can only be triggered from IR?

Hi Haoxin -

Thank you for finding and reporting bugs!

For this example, it takes a rare sequence of unoptimized IR instructions to trigger the bug in instcombine. (It is possible that the regression test that I created for this could be reduced a bit more, but not too much.)

And so that IR sequence would usually be optimized away by other passes or instcombine itself. That's why the bug has been hiding silently in LLVM for a very long time (maybe 10 years!).

I did not check exactly how https://reviews.llvm.org/rG7b0d59da9af4bf4eb made the bug invisible, but we know from Dawid's comment 2 that the bug must be in instcombine, so a patch in another pass could not have fixed or caused the root problem.

So I do not have a good answer to your question about frequency of bugs like this, but there are definitely many cases where a bug in some particular LLVM pass is invisible from C source (Clang) because other optimization passes prevent the problem pattern from being encountered.

Some other researchers/bots are fuzzing specific IR passes or sets of passes, and it yields bugs. The disadvantage of that approach is that the importance may not be as high if people think we can't possibly see the bug from Clang.

haoxintu commented 3 years ago

> It's just not visible with the source example in this report.

Thank you all for checking and fixing this!

And hey, Sanjay. Just out of curiosity, do you know why the source code example cannot reproduce the issue but the IR can? In other words, does it often happen that a bug can only be triggered from IR? I don't know in which situations source code may lose information when it is translated to IR. I would really appreciate any hints. Thanks for your time!

Best wishes, Haoxin

rotateright commented 3 years ago

I wasn't thinking about backporting a fix to the 13.0 release since the bug was already hidden in trunk using this example, so I did some cleanup and tried to fix another bug before this one:
https://reviews.llvm.org/rGa73973c9d461
https://reviews.llvm.org/rGfbb78668f2ee
https://reviews.llvm.org/rG982a15cb3fa0
https://reviews.llvm.org/rGc85f450619f7
https://reviews.llvm.org/rG0d83e7203479

So if we do want to backport a fix, I think we'd need to take all of those to patch cleanly.

I'll mark this as fixed for now.

If someone wants to fix it in 13.0 too, please re-open.

rotateright commented 3 years ago

Created attachment 25232 [details] Instcombine stuck reproducer

Thanks! So the bug is still present in trunk. It's just not visible with the source example in this report. I'll take a look at fixing it.

llvmbot commented 3 years ago

Full instcombine debug log

llvmbot commented 3 years ago

Instcombine stuck reproducer

rotateright commented 3 years ago

> Apparently instcombine is stuck visiting trunc/shl/and instructions, every time reaching the 49623fa77a35de343e89ea2d8159ce719473ce71 code path:
>
> IC: Visiting: %sext199 = shl i64 %shl, 24
> IC: Visiting: %shl.tr100 = and i64 %sext199, 72057594021150720
> IC: Visiting: %sext1 = trunc i64 %shl.tr100 to i32

Can you paste the full IR for that function before it enters instcombine?
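To illustrate what "stuck" means here, the sketch below is a toy model in C, not LLVM code. It assumes three rewrite rules that each look like a local improvement but together form a cycle, so a driver that runs until nothing changes never terminates unless it has a step budget. The names `shape`, `rewrite`, and `run_until_stable`, and the budget itself, are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical labels for the instruction shapes named in this thread. */
enum shape {
    TRUNC_AND_SHL, /* trunc (and (shl x, 24), mask) */
    TRUNC_SHL,     /* trunc (shl x, 24)             */
    SHL_TRUNC      /* shl (trunc x), 24             */
};

/* One rewrite step. Every rule changes the shape, so no shape is stable. */
static enum shape rewrite(enum shape s) {
    switch (s) {
    case TRUNC_AND_SHL: return TRUNC_SHL;     /* drop the mask         */
    case TRUNC_SHL:     return SHL_TRUNC;     /* narrow the shift      */
    case SHL_TRUNC:     return TRUNC_AND_SHL; /* widen and mask again  */
    }
    return s;
}

/* Applies rewrite() until the shape stops changing. Returns the number of
 * steps taken, or -1 if the budget ran out before reaching a fixed point. */
static long run_until_stable(enum shape start, long budget) {
    assert(budget >= 0);
    enum shape cur = start;
    for (long step = 0; step < budget; ++step) {
        enum shape next = rewrite(cur);
        if (next == cur)
            return step;
        cur = next;
    }
    return -1;
}
```

In this model `run_until_stable(TRUNC_AND_SHL, 1000)` returns -1 for any budget, which is the toy analogue of a compile that never finishes.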

oToToT commented 3 years ago

The exact commit is 7b0d59da9af4bf4eb8342cac579e42a818ac1ae7. After this commit, I can't reproduce this problem with the given code.

llvmbot commented 3 years ago

Apparently instcombine is stuck visiting trunc/shl/and instructions, every time reaching the 49623fa77a35de343e89ea2d8159ce719473ce71 code path:

IC: Visiting: %sext199 = shl i64 %shl, 24
IC: Visiting: %shl.tr100 = and i64 %sext199, 72057594021150720
IC: Visiting: %sext1 = trunc i64 %shl.tr100 to i32
IC: ADD DEFERRED: %shl.tr100 = and i64 %sext199, 72057594021150720
IC: Mod = %sext1 = trunc i64 %shl.tr100 to i32
    New = %sext1 = trunc i64 %sext199 to i32
IC: ADD: %sext1 = trunc i64 %sext199 to i32
IC: ERASE %shl.tr100 = and i64 %sext199, 72057594021150720
IC: ADD DEFERRED: %sext199 = shl i64 %shl, 24
IC: ADD: %sext199 = shl i64 %shl, 24
IC: Visiting: %sext199 = shl i64 %shl, 24
IC: Visiting: %sext1 = trunc i64 %sext199 to i32
IC: ADD DEFERRED: %shl.tr = trunc i64 %shl to i32
IC: Old = %sext1 = trunc i64 %sext199 to i32
    New = <badref> = shl i32 %shl.tr, 24
IC: ADD: %sext1 = shl i32 %shl.tr, 24
IC: ERASE %4 = trunc i64 %sext199 to i32
IC: ADD DEFERRED: %sext199 = shl i64 %shl, 24
IC: ERASE %sext199 = shl i64 %shl, 24
IC: ADD DEFERRED: %shl = shl i64 %conv19, 4294967286
IC: ADD: %shl.tr = trunc i64 %shl to i32
IC: Visiting: %shl.tr = trunc i64 %shl to i32
IC: Visiting: %sext1 = shl i32 %shl.tr, 24
IC: ADD DEFERRED: %sext1101 = shl i64 %shl, 24
IC: ADD DEFERRED: %shl.tr102 = and i64 %sext1101, 72057594021150720
IC: Old = %sext1 = shl i32 %shl.tr, 24
    New = <badref> = trunc i64 %shl.tr102 to i32
IC: ADD: %sext1 = trunc i64 %shl.tr102 to i32
IC: ERASE %4 = shl i32 %shl.tr, 24
IC: ADD DEFERRED: %shl.tr = trunc i64 %shl to i32
IC: ERASE %shl.tr = trunc i64 %shl to i32
IC: ADD DEFERRED: %shl = shl i64 %conv19, 4294967286
IC: ADD: %shl.tr102 = and i64 %sext1101, 72057594021150720
IC: ADD: %sext1101 = shl i64 %shl, 24
IC: Visiting: %sext1101 = shl i64 %shl, 24
IC: Visiting: %shl.tr102 = and i64 %sext1101, 72057594021150720
IC: Visiting: %sext1 = trunc i64 %shl.tr102 to i32

Not sure how bad the badref occurrence is in this case. The problem can be reproduced on the release/12 branch but is no longer seen on trunk.
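The three shapes that the log keeps cycling through all compute the same 32-bit value, which is why each individual rewrite is legal and only their combination is a problem. The sketch below models them in C with unsigned integers. The mask constant is taken from the log (72057594021150720, which equals 2^56 - 2^24); the function names are hypothetical, and the code is a model of the IR expressions, not LLVM's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Mask from the log: bits 24..55 set. */
#define LOG_MASK UINT64_C(72057594021150720)
static_assert(LOG_MASK == (UINT64_C(1) << 56) - (UINT64_C(1) << 24),
              "mask keeps exactly bits 24..55");

/* trunc i64 (and (shl x, 24), mask) to i32 */
static uint32_t form_trunc_and_shl(uint64_t x) {
    return (uint32_t)((x << 24) & LOG_MASK);
}

/* trunc i64 (shl x, 24) to i32: the mask only clears bits that the
 * truncation discards or the shift already zeroed. */
static uint32_t form_trunc_shl(uint64_t x) {
    return (uint32_t)(x << 24);
}

/* shl i32 (trunc i64 x to i32), 24: the shift done in the narrow type. */
static uint32_t form_shl_trunc(uint64_t x) {
    return (uint32_t)x << 24;
}

/* Returns 1 when all three forms produce the same value for x. */
static int forms_agree(uint64_t x) {
    uint32_t a = form_trunc_and_shl(x);
    return a == form_trunc_shl(x) && a == form_shl_trunc(x);
}
```

Since the forms are interchangeable, a terminating optimizer has to settle on one of them as canonical; the log suggests that for this input two transforms disagreed about which one that is.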

oToToT commented 3 years ago

Not sure whether this is helpful, but this bug has been present since commit 49623fa77a35de343e89ea2d8159ce719473ce71.