Closed llvmbot closed 2 years ago
Fix in https://reviews.llvm.org/D43677.
Ok, this time I can reproduce the failure. The problem is that shrink-wrapping doesn't see MovePCtoLR8
as a def of a CSR. It then freely inserts the prologue in a successor to entry. As a result, we end up clobbering the LR before we've saved it in the prologue, which is obviously bad.
I'm testing a fix and will update this PR tomorrow with some results.
reproducer for release_60 OK, I was able to whittle the application down.
This example should reproduce the bl -> bl instruction sequence. The resulting executable crashes with a Segmentation fault on my system.
It does look like I need a new reproducer, since the characteristics of the original have been tickled somehow over time.
Just updated to r325786 on release_60 this morning, rebuilt, and get the same bug. The build is on a Power8 machine wiht gcc 5.4.0 and the test is run on a Power8 machine. There are no mods to the LLVM sources at all in this build.
% git remote -v origin http://llvm.org/git/llvm.git (fetch) origin http://llvm.org/git/llvm.git (push) % git checkout release_60 % git pull origin release_60
% cmake ReleaseSixSourceDir -DLLVM_TARGETS_TO_BUILD=PowerPC -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_TERMINFO=OFF -DCMAKE_EXE_LINKER_FLAGS="-static-libgcc -static-libstdc++" % make % make install
Note again that this bug requires the large code model. If I remove that, the test application will pass/run to correct completion.
Here are the updated LLVM tool commands.
% opt -O2 ./test.ll -S -o ./test.llvm % llc ./test.llvm -mcpu=pwr8 -O2 -code-model=large -o ./test.s % /usr/bin/as ./test.s -mpower8 -many -o test.o
PS: The number in the previous post is the git version hash that corresponds to r287059 (see, e.g. Hal's comment, above).
PPS: Community editions of the PGI compilers are available for download.
Unfortunately, I'm still unable to reproduce this. We should try to look into what the differences are between your environment and mine.
A similar code gen issue happens in release_60. In the test application, we have a call sequence as follows.
0x0000000010000ccc <+436>: bl 0x10000dc0 => 0x0000000010000cd0 <+440>: nop
The function called (0x10000dc0) is:
0x0000000010000db8 <+0>: ld r2,-8(r12) 0x0000000010000dbc <+4>: add r2,r2,r12 => 0x0000000010000dc0 <+8>: bl 0x10000dc4 0x0000000010000dc4 <+12>: mflr r6
So, we have a bl instruction jumping to another bl instruction, which will immediately overwrite the link register from the first call. After that, the application fails.
Oh wow. There's something really wrong if the first instruction in the local entry point is an unconditional branch and link instruction. The loop makes sense to me now - we set the link register to the beginning of the prologue (0x0000000010000dc4), save it and presumably restore it so the blr
brings control back to 0x0000000010000dc4.
I'll try to reproduce this.
As before, reverting 883332301 fixes the bug. What does this number mean? Is this the same revision (relative jump table encodings)?
A similar code gen issue happens in release_60. In the test application, we have a call sequence as follows.
0x0000000010000ccc <+436>: bl 0x10000dc0 => 0x0000000010000cd0 <+440>: nop
The function called (0x10000dc0) is:
0x0000000010000db8 <+0>: ld r2,-8(r12) 0x0000000010000dbc <+4>: add r2,r2,r12 => 0x0000000010000dc0 <+8>: bl 0x10000dc4 0x0000000010000dc4 <+12>: mflr r6
So, we have a bl instruction jumping to another bl instruction, which will immediately overwrite the link register from the first call. After that, the application fails.
As before, reverting 883332301 fixes the bug.
Second question:
The code sequence
bl .L0$pb
.L0$pb: ld 4, .LC0@toc@l(3)
loads the link register with the address of the ld instruction.
The code sequence
bltlr
branches to the link register if the condition is less than. The link register that is 4 instructions previous. This forms a conditional loop.
If we look at the body of the loop, there are no instructions that can change the condition, so if it is less than, it will always be less than in future iterations.
First question:
I am not sure if this bug was fixed after release 4.0. It is possible.
Does this still reproduce for you on ToT? I can't seem to reproduce it. In fact, there is no jump table in the produced assembly for the switch.
Furthermore, I'm not sure I really understand what the claimed problem is/was. Was it that the contents of the link register are incorrect so the bltlr
branches to the wrong place? Or was it that the branch wasn't taken when it should have been and then we loop forever in the loop within the test case? Or something entirely different?
% llc --version LLVM (http://llvm.org/): LLVM version 4.0.1 DEBUG build with assertions. Default target: powerpc64le-unknown-linux-gnu Host CPU: pwr8
Registered Targets: ppc32 - PowerPC 32 ppc64 - PowerPC 64 ppc64le - PowerPC 64 LE
By the way, the test provided is executable. When assembled and linked, the a.out file will run forever.
It's right at the top of the function. It is after the basic block #0 comment. So, replace "in the prologue" with "immediately after the prologue".
.Lfunc_toc0: # @z0042 .quad .TOC.-.Lfunc_gep0 z0042: .Lfunc_begin0: .Lfunc_gep0: ld 2, .Lfunc_toc0-.Lfunc_gep0(12) add 2, 2, 12 .Lfunc_lep0: .localentry z0042, .Lfunc_lep0-.Lfunc_gep0
addis 3, 2, .LC0@toc@ha
bl .L0$pb
.L0$pb:
Well, is it looping in the prologue or really in the epilogue? I don't understand how it could do the former, but there might be a clobber issue for the latter.
FYI: r287059 - Always use relative jump table encodings on PowerPC64.
assigned to @nemanjai
This has long since been fixed. I'm not sure why I never closed the issue/bug.
Extended Description
This bug appears in the release_40 branch on PowerPC.
To reproduce:
% opt -O2 bug.bc -o bug.opt.bc % llc bug.opt.bc -mcpu=native -code-model=large -o bug.s
In the prologue for z0042(), there should be the following sequence
BB#0: # %L.entry
.L0$pb: ld 4, .LC0@toc@l(3) mflr 3 lwz 4, 0(4) cmpwi 4, 1 bltlr 0
which is a call from the prologue to the next instruction. This call sends the program into an infinite loop.
The reproducer has reduced from a much larger code to isolate this specific code gen pattern.
The SVN commit that results in this failure was 287059. Reverting that changeset eliminates the infinite loop and corrects the test application.