Closed Quuxplusone closed 6 years ago
Bugzilla Link | PR38661 |
Status | RESOLVED FIXED |
Importance | P normal |
Reported by | Josh Stone (jistone@redhat.com) |
Reported on | 2018-08-21 10:05:09 -0700 |
Last modified on | 2018-09-14 16:03:24 -0700 |
Version | trunk |
Hardware | Other Linux |
CC | dje.gcc@gmail.com, glaubitz@physik.fu-berlin.de, hfinkel@anl.gov, lion@aosc.io, llvm-bugs@lists.llvm.org, nemanja.i.ibm@gmail.com, simonas+llvm.org@kazlauskas.me, tstellar@redhat.com |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also | https://github.com/rust-lang/rust/issues/50960 |
I bisected this back to r202451, which makes it a regression from LLVM 3.4 to
3.5. At that commit, the testcase throws a hard error:
Impossible reg-to-reg copy
UNREACHABLE executed at ../lib/Target/PowerPC/PPCInstrInfo.cpp:672!
This much was fixed soon after by r203041, and that gave me a clue. Moving
that fix a few lines earlier also corrects the issue here in call stack args:
diff --git a/lib/Target/PowerPC/PPCISelLowering.cpp
b/lib/Target/PowerPC/PPCISelLowering.cpp
index 037c4b5de9d1..a903f431ccee 100644
--- a/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -5463,10 +5463,11 @@ SDValue PPCTargetLowering::LowerCall_32SVR4(
Arg = PtrOff;
}
- if (VA.isRegLoc()) {
- if (Arg.getValueType() == MVT::i1)
- Arg = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i32, Arg);
+ // Promote booleans to 32-bit values.
+ if (Arg.getValueType() == MVT::i1)
+ Arg = DAG.getNode(ISD::ZERO_EXTEND, dl, MVT::i32, Arg);
+ if (VA.isRegLoc()) {
seenFloatArg |= VA.getLocVT().isFloatingPoint();
// Put argument in a physical register.
RegsToPass.push_back(std::make_pair(VA.getLocReg(), Arg));
I haven't looked deeper at the problem in the function definition yet, but I
believe that will be in PPCTargetLowering::LowerFormalArguments_32SVR4.
I am here. (waving)
For convenience, I will attach the C++ PoC here:
Library:
#include <iostream>
extern "C" void strange_function(
bool r3, bool r4, bool r5, bool r6, bool r7, bool r8, bool r9, bool r10,
bool s1, bool s2
) {
std::cout << r3 << r4 << r5 << r6 << r7 << r8 << r9 << r10 << s1 << s2 << std::endl;
};
Main program:
extern "C" void strange_function(
bool r3, bool r4, bool r5, bool r6, bool r7, bool r8, bool r9, bool r10,
bool s1, bool s2
);
int main() {
strange_function(true, true, true, true, true, true, true, true, true, true);
}
The expected result is '1111111111', but it printed '1111111100' on PPC, if
these two parts are compiled with -O0/-O1 vs. -O2/-O3.
The expected calling convention: Passing first 8 integer arguments by register
from r3 to r10, the stack part:
| XX XX XX XX <- r1 (a pointer to the previous frame, big-endian)
| 00 00 00 01 (0x01 extended to 0x00000001 and stored in big-endian)
V 00 00 00 01 (0x01 extended to 0x00000001 and stored in big-endian)
higher
But now the behaviour of LLVM is:
| XX XX XX XX <- r1 (a pointer to the previous frame, big-endian)
| 01 ?? ?? ?? (stored 0x01 to r1+0x8)
V 01 ?? ?? ?? (stored 0x01 to r1+0xC)
higher
It doesn't even clean the stack object and ignores 'zeroext' convention.
If a callee is using correct calling convention, it will read '0x??' and extend
to 32 bits, thus the program logic turns out to be an unknown behaviour.
Say we have a library for some Internet services, compiled with GCC.
start_server(Context *ctx, Address *addr, bool use_tls);
A frontend compiled with LLVM and -02(which is very common in release build)
called: start_server(.., .., true);
On callee side, use_tls can be true or false, depends on the background garbage
in the stack.
I checked i8 and i16, they don't have such issue.
> These problems do not arise with "llc -O0" or "-O1", which also means these
> two won't be bug-compatible if they're compiled with different optimization
> levels.
Yes and all conditions are as follow:
1. there are more than 8 integer arguments
2. using bool(i8) type in one of the 9th, 10th... arguments
3. optimisation level is >= 2
Minimum test:
#=======================================================================
target triple = "powerpc-unknown-linux-gnu"
define zeroext i1 @check_callee(
i1 zeroext %r3, i1 zeroext %r4, i1 zeroext %r5, i1 zeroext %r6,
i1 zeroext %r7, i1 zeroext %r8, i1 zeroext %r9, i1 zeroext %r10,
i1 zeroext %s1, i1 zeroext %s2
) {
call void @check_caller(
i1 zeroext true, i1 zeroext true, i1 zeroext true, i1 zeroext true,
i1 zeroext true, i1 zeroext true, i1 zeroext true, i1 zeroext true,
i1 zeroext %s1, i1 zeroext %s2)
ret i1 true
}
declare void @check_caller(
i1 zeroext, i1 zeroext, i1 zeroext, i1 zeroext,
i1 zeroext, i1 zeroext, i1 zeroext, i1 zeroext,
i1 zeroext, i1 zeroext
)
#=======================================================================
$ llc test.ll -o /dev/stdout -ppc-asm-full-reg-names -mattr=-crbits
=> Good
$ llc test.ll -o /dev/stdout -ppc-asm-full-reg-names -mattr=+crbits
=> Bad
After checking the whole PPCISelLowering.cpp I wrote this patch and submitted a revision at here: https://reviews.llvm.org/D51108.
Fix committed on trunk: https://reviews.llvm.org/rL342288