Open JonPsson opened 1 year ago
$ time ./bin/llc -mtriple=s390x-linux-gnu -mcpu=z15 -O3 tc_SCEV_LDP.ll -o /dev/null
real 0m8.435s
user 0m8.419s
sys 0m0.016s
Works for me. Please provide better reproduction steps.
I tried again on the very latest trunk and still see the segfault. I build on SystemZ with
-DCMAKE_BUILD_TYPE="Release" -DLLVM_ENABLE_ASSERTIONS=On -DLLVM_TARGETS_TO_BUILD=SystemZ
git commit 7cf5581
$ ./bin/llc -mtriple=s390x-linux-gnu -O1 tc_SCEV_LDP.ll -o /dev/null >& /dev/null; echo $?
Segmentation fault (core dumped)
139
Very interesting if you are not building on SystemZ and only I get this error...(?)
What platform are you building on? Do you still not get the segfault..?
This is plain debian gnu/linux, amd64, aa6ea6009fc50b02dbf3788ee9fe605081b154f6
$ ./bin/llc -mtriple=s390x-linux-gnu -O1 /tmp/tc_SCEV_LDP.ll -o /dev/null >& /dev/null; echo $?
0
$ sha512sum /tmp/tc_SCEV_LDP.ll
22502cc50503d8171583e8477597c032183f1682006c138c733f6780cfdc3c7175014c38265b7f996c544ab7dce8f8312ec6b439d7fbfa472c4f567ec08e57ea /tmp/tc_SCEV_LDP.ll
Please reopen with an actionable testcase.
One thing to note, how does it crash? Is there a stack overflow, and how many frames deep is it? Is there recursion at play?
@JonPsson what kind of build are you using? Is it debug build or release with asserts?
It's a stack overflow with 15k+ frames of SCEV calls. See above for the cmake options I used.
Repro:
$ ulimit -s 128 # !!!
$ ./bin/llc -mtriple=s390x-linux-gnu -mcpu=z15 -O3 /tmp/tc_SCEV_LDP.ll -o /dev/null
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: ./bin/llc -mtriple=s390x-linux-gnu -mcpu=z15 -O3 /tmp/tc_SCEV_LDP.ll -o /dev/null
1. Running pass 'Function Pass Manager' on module '/tmp/tc_SCEV_LDP.ll'.
2. Running pass 'Loop Data Prefetch' on function '@h'
#0 0x00007f43dfaa4ff3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /repositories/llvm-project/llvm/lib/Support/Unix/Signals.inc:567:13
#1 0x00007f43dfaa2e80 llvm::sys::RunSignalHandlers() /repositories/llvm-project/llvm/lib/Support/Signals.cpp:105:18
#2 0x00007f43dfaa54fa SignalHandler(int) /repositories/llvm-project/llvm/lib/Support/Unix/Signals.inc:412:1
#3 0x00007f43df65af90 (/lib/x86_64-linux-gnu/libc.so.6+0x3bf90)
#4 0x00007f43e07d56f1 llvm::ScalarEvolution::getRangeRef(llvm::SCEV const*, llvm::ScalarEvolution::RangeSignHint, unsigned int) /repositories/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp:6512:0
#5 0x00007f43e07c7e30 llvm::ScalarEvolution::getSignedRange(llvm::SCEV const*) /repositories/llvm-project/llvm/include/llvm/Analysis/ScalarEvolution.h:0:0
I can only repeat what I have said previously.
SCEV should not be recursive. getSCEV() is still recursive.
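For illustration, the usual way to take recursion out of a traversal like this is an explicit worklist plus a cache, so that depth costs heap memory instead of native stack. The following is only a minimal sketch on a toy expression type: `Expr` and `evalIterative` are hypothetical names made up for this example, not LLVM's SCEV API, and the sketch assumes the expression graph is acyclic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <unordered_map>
#include <vector>

// Hypothetical toy expression node (NOT llvm::SCEV): Constant + Lhs + Rhs.
struct Expr {
  int64_t Constant = 0;
  const Expr *Lhs = nullptr; // null means "no operand"
  const Expr *Rhs = nullptr;
};

// Post-order evaluation with an explicit worklist instead of recursion.
// Assumes the expression graph is acyclic.
int64_t evalIterative(const Expr *Root) {
  std::unordered_map<const Expr *, int64_t> Cache;
  std::vector<const Expr *> Worklist{Root};
  while (!Worklist.empty()) {
    const Expr *E = Worklist.back();
    bool Ready = true;
    // Schedule any operand that has not been evaluated yet.
    for (const Expr *Op : {E->Lhs, E->Rhs}) {
      if (Op && !Cache.count(Op)) {
        Worklist.push_back(Op);
        Ready = false;
      }
    }
    if (!Ready)
      continue; // revisit E once its operands are cached
    int64_t L = E->Lhs ? Cache.at(E->Lhs) : 0;
    int64_t R = E->Rhs ? Cache.at(E->Rhs) : 0;
    Cache[E] = E->Constant + L + R;
    Worklist.pop_back();
  }
  return Cache.at(Root);
}
```

With this shape, a chain that is hundreds of thousands of nodes deep only grows the `Worklist` vector, so the result no longer depends on `ulimit -s`.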
It crashed for me on a machine with ulimit -s 8192 (15k+ frames). Should the program also compile with your value of 128, which is considerably lower...?
I do not understand the question. I can't reproduce with normal settings here, but Debian uses glibc, which has a sane default stack size, so that's not unexpected. To reproduce, one needs to lower the stack size before running the reproducer.
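To put the two stack sizes in perspective, here is a rough back-of-the-envelope sketch. It assumes the figures quoted in this thread (a crash after roughly 15k frames with `ulimit -s 8192`) and treats all frames as equally sized, which real frames are not; since the report says 15k+, the per-frame figure is an upper bound at best. The helper names are hypothetical. `softStackLimitKiB` uses the POSIX `getrlimit` call to read the same soft limit that `ulimit -s` prints.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <sys/resource.h>

// Soft stack limit in KiB, or nullopt if it is unlimited or cannot be queried.
std::optional<uint64_t> softStackLimitKiB() {
  rlimit Limit{};
  if (getrlimit(RLIMIT_STACK, &Limit) != 0)
    return std::nullopt;
  if (Limit.rlim_cur == RLIM_INFINITY)
    return std::nullopt;
  return static_cast<uint64_t>(Limit.rlim_cur) / 1024;
}

// Average bytes per frame if Frames frames exhaust a stack of StackKiB KiB.
// Assumption: uniform frame size (a simplification).
uint64_t averageFrameBytes(uint64_t StackKiB, uint64_t Frames) {
  return StackKiB * 1024 / Frames;
}

// Number of frames of FrameBytes bytes that fit into StackKiB KiB.
uint64_t maxFrames(uint64_t StackKiB, uint64_t FrameBytes) {
  return StackKiB * 1024 / FrameBytes;
}
```

Under those assumptions, `averageFrameBytes(8192, 15000)` comes out at about 559 bytes per frame, and `maxFrames(128, 559)` at about 234 frames, so a 128 KiB stack overflows far earlier in the same recursion than an 8 MiB one.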
With a value of 128 I can't even link llc, so a crash is not a big surprise then...
I have looked into this a bit more, and it seems that I got the crash on a machine where -fno-semantic-interposition was not used. So I think that if you build llc with
-DCMAKE_C_FLAGS_RELEASE="-fsemantic-interposition" -DCMAKE_CXX_FLAGS_RELEASE="-fsemantic-interposition"
you will also see the segfault...
I doubt it was genuinely caused by my patch, but maybe there is some old bug that was triggered by this change. Likely the very fact of constructing giant SCEVs here is the issue. We'll look into it.
I never thought so either - I think I have seen this problem before. I was merely hoping you might take a look as I saw you have been working on it and may be familiar with it... thanks.
testcase.tar.gz
The file has nearly 1000 small loops, so it's not a huge surprise that SCEV could get into trouble. I found by bisecting that before 0b74cb4 "[SCEV] Introduce field for storing SymbolicMaxNotTaken. NFCI", this program terminated normally after 15 seconds, but lately it is instead crashing. I am guessing the ideal behaviour would be to abort the optimization with some kind of limit in SCEV, where "uncomputable" would be returned...
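To make the "limit that returns uncomputable" idea concrete, here is a minimal sketch on a toy expression type. Everything in it is an assumption for illustration: `Expr`, `evalBounded` and the depth budget are hypothetical names, not what ScalarEvolution actually does, and `std::nullopt` merely stands in for an "uncomputable" result.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical toy expression node (NOT llvm::SCEV): Constant + Lhs + Rhs.
struct Expr {
  int64_t Constant = 0;
  const Expr *Lhs = nullptr; // null means "no operand"
  const Expr *Rhs = nullptr;
};

// Recursive evaluation with a depth budget. Returns std::nullopt (the
// stand-in for "uncomputable") once the budget is used up, instead of
// recursing until the native stack overflows.
std::optional<int64_t> evalBounded(const Expr *E, unsigned DepthBudget) {
  if (!E)
    return int64_t{0}; // a missing operand contributes nothing
  if (DepthBudget == 0)
    return std::nullopt; // give up: expression is too deep
  std::optional<int64_t> L = evalBounded(E->Lhs, DepthBudget - 1);
  if (!L)
    return std::nullopt;
  std::optional<int64_t> R = evalBounded(E->Rhs, DepthBudget - 1);
  if (!R)
    return std::nullopt;
  return E->Constant + *L + *R;
}
```

A caller would treat `std::nullopt` the same way as any other analysis failure and skip the transformation, so a pathological input costs a missed optimization rather than a crash.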