Open dylanmckay opened 6 years ago
lib::num::flt2dec::strategy
In our own libcore, we've disabled a lot of this code because it's simply too big. This may not be an extra blocker from moving to LLVM 6.
Do I understand this code correct. It tries to return a struct with the fields initialized to zero?
And if this is the machine code:
bb.0.start:
%2:ld8 = LDIRdK 0
%3:dldregs = LDIWRdK 0
$r21r20 = COPY %2
$r23r22 = COPY %3
$r24 = COPY %3
RET implicit $r21r20, implicit $r23r22, implicit $r24
Then the for some reason the copy instructions for 16bit registers and 8bit registers get mixed up. This doesn't sound to hard to fix.
It tries to return a struct with the fields initialized to zero?
That seems like it to me!
This doesn't sound to hard to fix.
Awesome!
~I think I found it. In ~AVRTargetLowering::LowerReturn
the destination of the copy is reversed because of the calling convention but not the source.
~~~But here we go [gist](https://gist.github.com/brainlag/309d47cd78aedc081773d7e14a65468d):~~~
```diff
diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp
index 7ac8a136e6b..4f7d3d6a815 100644
--- a/lib/Target/AVR/AVRISelLowering.cpp
+++ b/lib/Target/AVR/AVRISelLowering.cpp
@@ -1371,16 +1371,12 @@ AVRTargetLowering::LowerReturn(SDValue Chain, CallingConv::ID CallConv,
MachineFunction &MF = DAG.getMachineFunction();
unsigned e = RVLocs.size();
- // Reverse splitted return values to get the "big endian" format required
- // to agree with the calling convention ABI.
- if (e > 1) {
- std::reverse(RVLocs.begin(), RVLocs.end());
- }
-
SDValue Flag;
SmallVector<SDValue, 4> RetOps(1, Chain);
// Copy the result values into the output registers.
- for (unsigned i = 0; i != e; ++i) {
+ // Reverse loop to get the "big endian" format required
+ // to agree with the calling convention ABI.
+ for (int i = (e - 1); i < 0; --i) {
CCValAssign &VA = RVLocs[i];
assert(VA.isRegLoc() && "Can only return in registers!");
diff --git a/test/CodeGen/AVR/rust-avr-bug-92.ll b/test/CodeGen/AVR/rust-avr-bug-92.ll
new file mode 100644
index 00000000000..f93bf7a152d
--- /dev/null
+++ b/test/CodeGen/AVR/rust-avr-bug-92.ll
@@ -0,0 +1,6 @@
+; RUN: llc < %s -march=avr | FileCheck %s
+
+define { i8, i32 } @chapstick() {
+start:
+ ret { i8, i32 } zeroinitializer
+}
\ No newline at end of file
```
Can you narrow down what exactly the problem is? Is it just that the condition (if (e > 1)
) has been removed? Is it that the call to std::reverse
is "broken"?
I wonder if it would still work if the only change was removing if (e > 1) {
.
Damn, I should have run the tests before. It breaks a lot of stuff. It made so much sense.
I think I know whats going on. The AVR couldn't handle returning structs like this at all (see #57). It only expects to ever return an iXX and only when XX is >=32 it needs to reverse the registers because of the calling convention. Now for this bug where we have { i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse registers 1 and 2 but not 0.
I don't know if this produces correct output but it prevents the crash.
This patch just avoids to return a struct through registers. Which only works for
a few/small structs anyway.
https://gist.github.com/brainlag/4d9217eefcac1f06df682415975d97c3
diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp
index 7ac8a136e6b..e0bb2f317d1 100644
--- a/lib/Target/AVR/AVRISelLowering.cpp
+++ b/lib/Target/AVR/AVRISelLowering.cpp
@@ -1346,7 +1346,7 @@ AVRTargetLowering::CanLowerReturn(CallingConv::ID CallConv,
CCState CCInfo(CallConv, isVarArg, MF, RVLocs, Context);
auto CCFunction = CCAssignFnForReturn(CallConv);
- return CCInfo.CheckReturn(Outs, CCFunction);
+ return !MF.getFunction().getReturnType()->isStructTy() && CCInfo.CheckReturn(Outs, CCFunction);
}
SDValue
diff --git a/test/CodeGen/AVR/rust-avr-bug-92.ll b/test/CodeGen/AVR/rust-avr-bug-92.ll
new file mode 100644
index 00000000000..e76ce630c67
--- /dev/null
+++ b/test/CodeGen/AVR/rust-avr-bug-92.ll
@@ -0,0 +1,7 @@
+; RUN: llc < %s -march=avr | FileCheck %s
+
+; CHECK-LABEL: main
+define { i8, i32 } @main() {
+start:
+ ret { i8, i32 } zeroinitializer
+}
\ No newline at end of file
I think I know whats going on. The AVR couldn't handle returning structs like this at all (see #57). It only expects to ever return an iXX and only when XX is >=32 it needs to reverse the registers because of the calling convention. Now for this bug where we have { i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse registers 1 and 2 but not 0.
This makes a lot of sense. Sounds like we need to adjust this logic and write some more calling convention tests.
I don't know if this produces correct output but it prevents the crash. This patch just avoids to return a struct through registers. Which only works for a few/small structs anyway.
I suspect that this patch will break the calling convention for all functions which return structs, as they will now always be passed on the stack. FWIW, code compiled with this patch will work with code compiled under the same patch, but it would not work when compiling against objects from an older version of LLVM or AVR-GCC generated objects.
I am fairly busy the next handful of days, but after that this bug is at the top of my priority list. If anybody wants to look at this in the interim (please do!), feel free to ask questions on this issue, or ping me on Gitter.
In the past to fix calling convention issues like this, I've taken a C file and compared it side-by-side with LLVM/GCC compiled code.
I've found that this is the best way to make sure our calling convention implementation is accurate. It looks like we are missing struct-return tests, which would be good to add.
I think I know whats going on. The AVR couldn't handle returning structs like this at all (see #57). It only expects to ever return an iXX and only when XX is >=32 it needs to reverse the registers because of the calling convention. Now for this bug where we have { i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse registers 1 and 2 but not 0.
This makes a lot of sense. Sounds like we need to adjust this logic and write some more calling convention tests.
I have code for this, but it felt like a ugly hack to me and after testing with a couple of different structs I found it really hard to find structs which don't use the stack. If a struct doesn't fit within 3 (or maybe for 4?) registered it used the stack even without this patch.
I update with examples later.
I tried it again. If the struct is <= 64 bit you can pass it without the stack.
Great work!
I will review the patch in the next few days.
Since @dylanmckay has been busy, and I've been trying to get engaged in this projct, I applied the patch and tried to compile libcore.
After working around #95 by lowering the optimization level, I got a new crash:
rustc: /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:8736: void llvm::SelectionDAGISel::LowerArguments(const llvm::Function&): Assertion `InVals.size() == Ins.size() && "LowerFormalArguments didn't emit the correct number of values!"' failed.
(gdb) bt
#0 0x00007ffff71f8860 in raise () from /usr/lib/libc.so.6
#1 0x00007ffff71f9ec9 in abort () from /usr/lib/libc.so.6
#2 0x00007ffff71f10bc in __assert_fail_base () from /usr/lib/libc.so.6
#3 0x00007ffff71f1133 in __assert_fail () from /usr/lib/libc.so.6
#4 0x00007fffeca26c14 in llvm::SelectionDAGISel::LowerArguments (
this=this@entry=0x7fffb61d9000, F=...)
at /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:8735
#5 0x00007fffeca77d6e in llvm::SelectionDAGISel::SelectAllBasicBlocks (
this=this@entry=0x7fffb61d9000, Fn=...)
at /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1402
#6 0x00007fffeca78d72 in llvm::SelectionDAGISel::runOnMachineFunction (
this=<optimized out>, mf=...)
at /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:466
#7 0x00007fffecce12c5 in llvm::MachineFunctionPass::runOnFunction (
this=0x7fffb61d9000, F=...)
at /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/MachineFunctionPass.cpp:62
(gdb) l 8735
8730 DAG.getRoot(), F.getCallingConv(), F.isVarArg(), Ins, dl, DAG, InVals);
8731
8732 // Verify that the target's LowerFormalArguments behaved as expected.
8733 assert(NewRoot.getNode() && NewRoot.getValueType() == MVT::Other &&
8734 "LowerFormalArguments didn't return a valid chain!");
8735 assert(InVals.size() == Ins.size() &&
8736 "LowerFormalArguments didn't emit the correct number of values!");
8737 DEBUG({
8738 for (unsigned i = 0, e = Ins.size(); i != e; ++i) {
8739 assert(InVals[i].getNode() &&
I poked around a bit, but as ignorant as I am about LLVM, it wasn't very insightful.
EDIT: I guess one thing is worth adding. If I counted fields correctly, the length of InVals is 8, while the length of Ins is 16.
as ignorant as I am about LLVM
Welcome to the club! 😸
Welcome to the club! :smile_cat:
Thanks! :smile:
In that case, does @brainlag have any more ideas?
Are you sure picked up the patch for #57?
I just checked, @brainlag, and the patch for #57 is the very last commit I have. I am using @shepmaster's rebase here.
Ok, you should pin the problem down by making a reduced testcase. Then create a new bug with the LLVM IR code that triggers the bug.
Whoops! It seems that I was using the wrong version of llc
. Nevermind.
Thanks for the link about reduced testcases, though. I always wondered how those got generated. :smile:
@brainlag to double-check: should both this patch and this patch be applied, or does the second one replace the first?
Only the second one.
Only the second one.
Cool. I've crossed out the first one to help in the future.
The current patch causes a few test failures.
I fixed the ones in add.ll
and sub.ll
in this patch. Two other tests are failing.
The tests are failing because registers are changing. I am unsure if the function calling convention argument registers have changed. The two tests I fixed up made the IR significantly uglier (see add+sub).
IIRC this patch did not break any tests. There must be something else going on.
I suspect these two tests are existing failures - I'm going to check again.
Is there an update on this? If this hasn't been fixed, is this a good problem for a beginner (partially to Rust and completely to LLVM)?
Yes, this should be good, @dylanmckay just need to verify it and upstream it.
Hi!
I recently tried to build avr-rust at the current Rust HEAD and attempted to build stock libcore. (I'm not really sure if that is supposed to work or not). After patching rustc
to handle separate program-data address spaces (I'll send a PR for that ~soon), I ran into the impossible reg-to-reg copy issue. I applied the "current patch" referenced above, however that still fails with:
llc: /home/logic/avr/src/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:8603: std::pair<llvm::SDValue, llvm::SDValue> llvm::TargetLowering::LowerCallTo(llvm::TargetLowering::CallLoweringInfo&) const: Assertion `EVT(CLI.Ins[i].VT) == InVals[i].getValueType() && "LowerCall emitted a value with the wrong type!"' failed.
Stack dump:
0. Program arguments: llc min0.ll
1. Running pass 'Function Pass Manager' on module 'min0.ll'.
2. Running pass 'AVR DAG->DAG Instruction Selection' on function '@core_iter_TakeWhile_try_fold_closure'
[...]
#9 0x00007f58ae86dcb9 llvm::TargetLowering::LowerCallTo(llvm::TargetLowering::CallLoweringInfo&) const /home/logic/avr/src/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:8602:0
#10 0x00007f58ae860d25 llvm::SelectionDAGBuilder::lowerInvokable(llvm::TargetLowering::CallLoweringInfo&, llvm::BasicBlock const*) /home/logic/avr/src/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:6395:0
#11 0x00007f58ae861617 llvm::SelectionDAGBuilder::LowerCallTo(llvm::ImmutableCallSite, llvm::SDValue, bool, llvm::BasicBlock const*) /home/logic/avr/src/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:6508:0
#12 0x00007f58ae86415c llvm::SelectionDAGBuilder::visitCall(llvm::CallInst const&) /home/logic/avr/src/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:7058:0
[...]
Attaching the debugger showed the following:
The assertion happens while lowering %3 = call addrspace(1) { i8, i16 } @core_iter_LoopState_from_try(i16 %2)
(from calling I.dump()
in frame #12
above). The issue appears to be that while CLI.Ins
contains the types {i8, i16}
, InVals
contains the types {i16, i8}
(note the swapped order).
To me, that seems very connected to the proposed patch in this PR, so I'm reporting this here rather than in a new issue.
Looking at AVRISelLowering.cpp
, there appear to be a total of three places where arguments are reversed:
1) The LowerReturn
function handled by the patch.
2) The LowerCallResult
function which looks very similar to (1) but isn't patched.
3) The analyzeStandardArguments
which I have no idea about.
Applying the same patch in location (2) seems to fix the assertion failure I reported above. (Although now I'm hitting #101).
Here's a patch that generalises @brainlag's original patch to both locations including the one @TimNN noticed was missing.
commit 912503b74d3c6dde0aa259205e72ca7369545260
Author: Dylan McKay <me@dylanmckay.io>
Date: Sun Sep 2 14:20:48 2018 +1200
[AVR] Fix for avr-rust/rust#92
diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp
index 57fc978b54b..a7b3bb7f155 100644
--- a/lib/Target/AVR/AVRISelLowering.cpp
+++ b/lib/Target/AVR/AVRISelLowering.cpp
@@ -1298,6 +1298,35 @@ SDValue AVRTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
InVals);
}
+/// Reverse splitted return values to get the "big endian" format required
+/// to agree with the calling convention ABI.
+static void ReverseArgumentsToBigEndian(MachineFunction &MF,
+ SmallVector<CCValAssign, 16> &RVLocs) {
+ if (RVLocs.size() > 1) {
+ // Some hackery because SelectionDAGBuilder does not split
+ // up arguments properly
+ Type *retType = MF.getFunction().getReturnType();
+ if (retType->isStructTy()) {
+ if (retType->getNumContainedTypes() > 1 &&
+ retType->getNumContainedTypes() > RVLocs.size()) {
+ for (unsigned i = 0, pos = 0;
+ i < retType->getNumContainedTypes(); ++i) {
+ Type *field = retType->getContainedType(i);
+ if(field->isIntegerTy() && field->getIntegerBitWidth() > 16) {
+ int Size = field->getIntegerBitWidth() / 16;
+ std::reverse(RVLocs.begin()+ pos, RVLocs.end() + pos + Size);
+ pos += Size;
+ } else {
+ pos++;
+ }
+ }
+ }
+ } else {
+ std::reverse(RVLocs.begin(), RVLocs.end());
+ }
+ }
+}
+
/// Lower the result values of a call into the
/// appropriate copies out of appropriate physical registers.
///
@@ -1315,10 +1344,8 @@ SDValue AVRTargetLowering::LowerCallResult(
auto CCFunction = CCAssignFnForReturn(CallConv);
CCInfo.AnalyzeCallResult(Ins, CCFunction);
- if (CallConv != CallingConv::AVR_BUILTIN && RVLocs.size() > 1) {
- // Reverse splitted return values to get the "big endian" format required
- // to agree with the calling convention ABI.
- std::reverse(RVLocs.begin(), RVLocs.end());
+ if (CallConv != CallingConv::AVR_BUILTIN) {
+ ReverseArgumentsToBigEndian(DAG.getMachineFunction(), RVLocs);
}
// Copy all of the result registers out of their specified physreg.
@@ -1383,9 +1410,7 @@ AVRTargetLowering::LowerReturn(SDValue Chain, CallingConv::ID CallConv,
// Reverse splitted return values to get the "big endian" format required
// to agree with the calling convention ABI.
- if (e > 1) {
- std::reverse(RVLocs.begin(), RVLocs.end());
- }
+ ReverseArgumentsToBigEndian(MF, RVLocs);
SDValue Flag;
SmallVector<SDValue, 4> RetOps(1, Chain);
I've been a bit slow with this because every time I started looking at the patch, I wasn't confident if it was breaking the calling convention or not, or if it'd be ABI-compatible with code generated by avr-gcc. One issue with this is the fact there is no automated integration testing for AVR - only some small repos I've setup, nothing automated. It would very easy to notice the bug @TimNN spotted if integration tests were testing the interaction between the different ISelLowering routines.
Integration tests are a bit tricky because avr-clang does not have rtlib linking logic (no CRT or libc is linked unless explicitly specified on the command line, including paths). That makes it hard to write tests because you'd kinda have to hardcode the computer-specific avr-gcc link library folders in the source files themselves. I've got a WIP patch that adds this linking logic though. The integration tests repository is located here.
I've done some more research and figured out that we are already incompatible with avr-gcc. I've written up BugZilla PR39251 to track this.
The avr-gcc calling convention docs state:
Return values with a size of 1 byte up to and including a size of 8 bytes will be returned in registers. Return values whose size is outside that range will be returned in memory.
I've done some experimentation with avr-gcc locally and noticed that integer types of 8 bytes or less will in fact be returned in registers, although contradictorily structure types, when returned, will only be returned in registers when they are of four bytes or less; the size limit if four for structs, 8 for integers.
Currently avr-llvm seems to always return structs on the stack.
Because of this, I think it's better to accept the patch and write a bug (EDIT: done in PR39251) for the avr-gcc incompatibility. I'm 99% certain the incompatibility was already there before Peter's patch (I am too lazy to checkout HEAD~1
and retest).
I need to write some more tests on the patch, hopefully get integration tests working.
I've written r39251 to track the avr-gcc incompatibility.
Current version of the patch, with tests and a bit of documentation.
ABI-compatible with code generated by avr-gcc
I've mentioned this before, but I as I understand this statement, it should be a non-goal. Rust already has the distinct notion of a rust
calling convention and a C
calling convention. The rust
calling convention should be whatever we can write that is most efficient; C
should match whatever clang produces, which presumably should attempt to match avr-gcc.
Ok, I think I understand better about how this works, which negates what I said before. Instead, it is correct to have the Clang + LLVM generate code that matches AVR-GCC.
Let me sum up what nagisa and Mark-Simulacrum told me:
"Rust"
calling convention actually gets mapped to LLVM's "C"
how the arguments are passed also matters for ABI purposes, but this is controlled by the compiler frontend. To get something compatible with C, the arguments must also be passed around in a specific manner. The "Rust"
calling convention does not ensure that it matches what C would do. For example:
pub extern "foo" fn foo(_: (i32, i32)) {}
// "Rust" => define void @foo(i32, i32)
// "C" => define void @foo(i64)
// From the playground (x86-64)
Differences could arise in how the Rust compiler with the Rust calling convention generates LLVM IR. For example, it might split up a single Rust argument into multiple LLVM IR arguments.
Well stated @shepmaster.
It does somewhat feel in vain given that I don't think anybody actually uses avr-clang in C projects, as the support is pretty lacking (for example, no crt or libgcc libraries are linked automatically).
I wonder if anybody has been using avr-gcc
in AVR cargo build scripts, linking avr-gcc objects, and using extern "C"
FFI. Regardless, it is important that we maintain compatibility.
A small update: the current patch from above, which reverses things in two locations, is actually incorrect. As was my earlier comment
Applying the same patch in location (2) seems to fix the assertion failure I reported above. (Although now I'm hitting #101).
The problem was that when I copied the patch to location (2), I did so incorrectly: Instead of "properly" reversing things, I messed up the code to never reverse anything (I initialised e
to RVLocs.size()
before the call to AnalyzeCallResult
, which means that e
was always 0
, and thus never > 1
).
You can find a working patch at https://github.com/TimNN/llvm/commit/71713c240836424023d57da7860ce623f27eab72
We've recently upgraded to LLVM 8 🎉 I'm going to close any bug that is reported against an older version of LLVM. If you are still having this issue with the LLVM-8 based code, please ping me and I can reopen the issue!
Reopening as this has a patch
I went to upstream this, but I noticed there are a few test failures (call.ll
, div.ll
, rem.ll
).
I made a few changes (added the new AVR doc to the docs index, fixed the failed tests so that they match what LLVM generates with the patch) and pushed it up onto dylanmckay/llvm@upstream-fix-avr-rust-92.
It looks like the patch changes the ABI, not just for structs, but also for indirect calls to function pointers returning sizes >16-bits.
The patch also changes the ABI of the libgcc
/compiler-rt
calls for division and remainder. You can see the changes in the below patch in the update to div.ll
and rem.ll
. As far as I can tell, this will either completely break div and mod or completely fix if it is the case it is already broken. It looks like the tests for sdiv32
et al have been around since late 2016, so it's likely they're correct.
commit 3c474f31c49643952326eae05fdfa2be4a7bff88
Author: Dylan McKay <me@dylanmckay.io>
Date: Mon Nov 5 19:00:11 2018 +1300
[AVR] Reverse split return values to into the "big endian" format required by the ABI
When returning structs, the fields need to be reversed. Without this,
the ABI of the generated executables is not compatible with avr-gcc,
nor will LLVM allow it to be compiled.
Here is the assertion message when complex structs are used:
Impossible reg-to-reg copy
UNREACHABLE executed at llvm/lib/Target/AVR/AVRInstrInfo.cpp!
This was first noticed in avr-rust/rust#92.
Description by Peter
The AVR couldn't handle returning structs like this at all (see avr-rust/rust#57).
It only expects to ever return an iXX and only when XX is >=32 it needs to reverse
the registers because of the calling convention. Now for this bug where we have
{ i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse
registers 1 and 2 but not 0.
Patch by Peter Nimmervoll (@brainlag) and Tim Neumann (@TimNN)
diff --git a/test/CodeGen/AVR/call.ll b/test/CodeGen/AVR/call.ll
index a2556e8c1e6..4a1de6eb603 100644
--- a/test/CodeGen/AVR/call.ll
+++ b/test/CodeGen/AVR/call.ll
@@ -172,15 +172,17 @@ define void @testcallprologue() {
define i32 @icall(i32 (i32)* %foo) {
; CHECK-LABEL: icall:
; CHECK: movw r30, r24
-; CHECK: ldi r22, 147
-; CHECK: ldi r23, 248
-; CHECK: ldi r24, 214
-; CHECK: ldi r25, 198
-; CHECK: icall
-; CHECK: subi r22, 251
-; CHECK: sbci r23, 255
-; CHECK: sbci r24, 255
-; CHECK: sbci r25, 255
+; CHECK-NEXT: ldi r22, 147
+; CHECK-NEXT: ldi r23, 248
+; CHECK-NEXT: ldi r24, 214
+; CHECK-NEXT: ldi r25, 198
+; CHECK-NEXT: icall
+; CHECK-NEXT: movw r18, r22
+; CHECK-NEXT: subi r24, 251
+; CHECK-NEXT: sbci r25, 255
+; CHECK-NEXT: sbci r18, 255
+; CHECK-NEXT: sbci r19, 255
+
%1 = call i32 %foo(i32 3335977107)
%2 = add nsw i32 %1, 5
ret i32 %2
@@ -200,10 +202,12 @@ define i32 @externcall(float %a, float %b) {
; CHECK: movw r20, [[REG1]]
; CHECK: call __divsf3
; CHECK: call foofloat
-; CHECK: subi r22, 251
-; CHECK: sbci r23, 255
-; CHECK: sbci r24, 255
+; CHECK: movw r18, r22
+; CHECK: subi r24, 251
; CHECK: sbci r25, 255
+; CHECK: sbci r18, 255
+; CHECK: sbci r19, 255
+
%1 = fdiv float %b, %a
%2 = call i32 @foofloat(float %1)
%3 = add nsw i32 %2, 5
diff --git a/test/CodeGen/AVR/div.ll b/test/CodeGen/AVR/div.ll
index 7626ecb8172..7ff8ae33456 100644
--- a/test/CodeGen/AVR/div.ll
+++ b/test/CodeGen/AVR/div.ll
@@ -44,8 +44,9 @@ define i16 @sdiv16(i16 %a, i16 %b) {
define i32 @udiv32(i32 %a, i32 %b) {
; CHECK-LABEL: udiv32:
; CHECK: call __udivmodsi4
-; CHECK-NEXT: movw r22, r18
-; CHECK-NEXT: movw r24, r20
+; CHECK-NEXT: movw r18, r22
+; CHECK-NEXT: movw r22, r24
+; CHECK-NEXT: movw r24, r18
; CHECK-NEXT: ret
%quot = udiv i32 %a, %b
ret i32 %quot
@@ -55,8 +56,9 @@ define i32 @udiv32(i32 %a, i32 %b) {
define i32 @sdiv32(i32 %a, i32 %b) {
; CHECK-LABEL: sdiv32:
; CHECK: call __divmodsi4
-; CHECK-NEXT: movw r22, r18
-; CHECK-NEXT: movw r24, r20
+; CHECK-NEXT: movw r18, r22
+; CHECK-NEXT: movw r22, r24
+; CHECK-NEXT: movw r24, r18
; CHECK-NEXT: ret
%quot = sdiv i32 %a, %b
ret i32 %quot
diff --git a/test/CodeGen/AVR/rem.ll b/test/CodeGen/AVR/rem.ll
index 47573e8dafc..eaa12b79774 100644
--- a/test/CodeGen/AVR/rem.ll
+++ b/test/CodeGen/AVR/rem.ll
@@ -42,6 +42,8 @@ define i16 @srem16(i16 %a, i16 %b) {
define i32 @urem32(i32 %a, i32 %b) {
; CHECK-LABEL: urem32:
; CHECK: call __udivmodsi4
+; CHECK: movw r22, r20
+; CHECK: movw r24, r18
; CHECK-NEXT: ret
%rem = urem i32 %a, %b
ret i32 %rem
@@ -51,6 +53,8 @@ define i32 @urem32(i32 %a, i32 %b) {
define i32 @srem32(i32 %a, i32 %b) {
; CHECK-LABEL: srem32:
; CHECK: call __divmodsi4
+; CHECK: movw r22, r20
+; CHECK: movw r24, r18
; CHECK-NEXT: ret
%rem = srem i32 %a, %b
ret i32 %rem
I have a suspicion
Functions with return types satisfying retType->getNumContainedTypes() > 1 && retType->getNumContainedTypes() > RVLocs.size()
are no longer reversed
Unsure if this is intentional.
diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp
index 8b7282f0c71..169c8e5cda2 100644
--- a/lib/Target/AVR/AVRISelLowering.cpp
+++ b/lib/Target/AVR/AVRISelLowering.cpp
@@ -1303,40 +1303,44 @@ SDValue AVRTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
static void ReverseArgumentsToBigEndian(MachineFunction &MF,
SmallVector<CCValAssign, 16> &RVLocs) {
if (RVLocs.size() > 1) {
// Some hackery because SelectionDAGBuilder does not split
// up arguments properly
Type *retType = MF.getFunction().getReturnType();
if (retType->isStructTy()) {
if (retType->getNumContainedTypes() > 1 &&
retType->getNumContainedTypes() > RVLocs.size()) {
for (unsigned i = 0, pos = 0;
i < retType->getNumContainedTypes(); ++i) {
Type *field = retType->getContainedType(i);
if(field->isIntegerTy() && field->getIntegerBitWidth() > 16) {
int Size = field->getIntegerBitWidth() / 16;
std::reverse(RVLocs.begin()+ pos, RVLocs.end() + pos + Size);
pos += Size;
} else {
pos++;
}
}
+ } else {
+ // FIXME: In the old version of this logic, this else branch
+ // would still reverse the return values. Old logic reversed if and
+ // always if RVLocs.size() > 1
+ // std::reverse(RVLocs.begin(), RVLocs.end()); ??
}
} else {
std::reverse(RVLocs.begin(), RVLocs.end());
}
}
}
So this means you hitting the struct path without having any structs?
It seems it, yeah. Unless there's something else going on like another location that needs to be reverse'd. I saw one last std::reverse
call, but it was related to function arguments. I am unsure if that could be the problem though.
This is a very old thread. But I'm joining in. 😄 The stuff in here is mainlined in avr-llvm and now we're getting some fairly serious breaks that are affecting me. #130 seems to be caused by this. I think when you just get a normal 32 bit return value the words are getting swapped. I'm creating unit tests for this and I'll try to attach them here and to #130. But for example in a really simple case, if you have a function that returns a 32 bit int and at the end just calls another external function that returns a 32 bit int, it should have no statements between the call XXX and ret, because the value should be returned in r22-r25 from the extern function and then returned back immediately. In all cases it thinks things are the wrong way round so it swaps r22/r23 with r24/r25. That breaks the rem.ll, div.ll and call.ll tests in exactly that way. Plus it breaks swift for Arduino because my code was doing int16 multiplications and returning 0 every time, real world impact. lol.
Anyway, it's going to take me a long time to understand what's going on in this thread, let alone with the code because I'm such a n00b, @brainlag @shepmaster @dylanmckay you guys want to chip in?
OK, think I've found what's causing all the problems @dylanmckay. In various bits of consolidation, etc. code got accidentally deleted from LowerCallResult
.
The patch should have been this:
@@ -1315,10 +1344,8 @@ SDValue AVRTargetLowering::LowerCallResult(
auto CCFunction = CCAssignFnForReturn(CallConv);
CCInfo.AnalyzeCallResult(Ins, CCFunction);
- if (CallConv != CallingConv::AVR_BUILTIN && RVLocs.size() > 1) {
- // Reverse splitted return values to get the "big endian" format required
- // to agree with the calling convention ABI.
- std::reverse(RVLocs.begin(), RVLocs.end());
+ if (CallConv != CallingConv::AVR_BUILTIN) {
+ ReverseArgumentsToBigEndian(DAG.getMachineFunction(), RVLocs);
}
But ended up as this:
auto CCFunction = CCAssignFnForReturn(CallConv);
CCInfo.AnalyzeCallResult(Ins, CCFunction);
- if (CallConv != CallingConv::AVR_BUILTIN && RVLocs.size() > 1) {
- // Reverse splitted return values to get the "big endian" format required
- // to agree with the calling convention ABI.
- std::reverse(RVLocs.begin(), RVLocs.end());
- }
So ReverseArgumentsToBigEndian
is never called from LowerCallResult
. I'm pretty sure this is just an accident. I'll do a small PR over the weekend with some unit tests.
Background, I think the way the code was working was this:
For results coming back from calls or being returned from a function that are i8 or i16, the standard register mapping works. This will give you r24 or r24/r25 as a pair.
If you had a result coming back from a call or function that was i32 or i64, SelectionDAGBuilder would ask "how many registers and what types will we use?" and get back "2 registers of i16" (or "4 registers of i16")
These would be naively assigned using RetCC_AVR
from AVRCallingConv.td
as register 0 (the low 16 bits): r24/r25, register 1 (the high 16 bits): r22/r23, which is word swapped. To fix that, if there were more than two registers in a return value, the old code reversed them and they got assigned the other way around so it all worked, but only for returning simple scalar types. I'm not sure they had ever got struct type returns working. If this was largely unchanged from the old code on Source Forge, that code would have probably only ever been tested on LLVM IR from clang, which probably wouldn't often (ever?) produce a struct return.
OK, I've come up with a patch: https://gist.github.com/carlos4242/f882c9f1213ef006feb6984acef291d7
Built the official 8.0 release code with AVR target enabled, llc
ed the code in OP, and llc
segfaulted.
This bug occurs in my LLVM 6.0 branch at #91 when I compile stock libcore.
Note that this is coming from
@_ZN3lib3num7flt2dec8strategy5grisu22max_pow10_no_more_than17h83b3c65a945b5158E
- the Grisu float-string algorithm.I have minimised down to a testcase, and also collected
llc
debug outputsYou can see the build artifacts here.