UNREACHABLE Impossible reg-to-reg copy

dylanmckay commented 6 years ago

This bug occurs in my LLVM 6.0 branch at #91 when I compile stock libcore.

Impossible reg-to-reg copy
UNREACHABLE executed at /home/dylan/projects/llvm-project/llvm/lib/Target/AVR/AVRInstrInfo.cpp:75!
#0 0x00007fd02d6f8cc6 llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/dylan/projects/llvm-project/llvm/lib/Support/Unix/Signals.inc:398:11
#1 0x00007fd02d6f8ec9 PrintStackTraceSignalHandler(void*) /home/dylan/projects/llvm-project/llvm/lib/Support/Unix/Signals.inc:462:1
#2 0x00007fd02d6f7260 llvm::sys::RunSignalHandlers() /home/dylan/projects/llvm-project/llvm/lib/Support/Signals.cpp:0:5
#3 0x00007fd02d6f92ba SignalHandler(int) /home/dylan/projects/llvm-project/llvm/lib/Support/Unix/Signals.inc:0:3
#4 0x00007fd02c5d6dd0 __restore_rt (/usr/lib/libpthread.so.0+0x11dd0)
#5 0x00007fd02b958860 __GI_raise (/usr/lib/libc.so.6+0x34860)
#6 0x00007fd02b959ec9 __GI_abort (/usr/lib/libc.so.6+0x35ec9)
#7 0x00007fd02d5f71f0 llvm::install_out_of_memory_new_handler() /home/dylan/projects/llvm-project/llvm/lib/Support/ErrorHandling.cpp:193:0
#8 0x00007fd032637599 llvm::AVRInstrInfo::copyPhysReg(llvm::MachineBasicBlock&, llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>, llvm::DebugLoc const&, unsigned int, unsigned int, bool) const /home/dylan/projects/llvm-project/llvm/lib/Target/AVR/AVRInstrInfo.cpp:0:7
#9 0x00007fd03004dd3b (anonymous namespace)::ExpandPostRA::LowerCopy(llvm::MachineInstr*) /home/dylan/projects/llvm-project/llvm/lib/CodeGen/ExpandPostRAPseudos.cpp:166:7
#10 0x00007fd03004d189 (anonymous namespace)::ExpandPostRA::runOnMachineFunction(llvm::MachineFunction&) /home/dylan/projects/llvm-project/llvm/lib/CodeGen/ExpandPostRAPseudos.cpp:212:23
#11 0x00007fd030230e61 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /home/dylan/projects/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:62:8
#12 0x00007fd02f76fdff llvm::FPPassManager::runOnFunction(llvm::Function&) /home/dylan/projects/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1520:27
#13 0x00007fd02f770152 llvm::FPPassManager::runOnModule(llvm::Module&) /home/dylan/projects/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1541:16
#14 0x00007fd02f7709b0 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /home/dylan/projects/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1597:27
#15 0x00007fd02f770436 llvm::legacy::PassManagerImpl::run(llvm::Module&) /home/dylan/projects/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1700:16
#16 0x00007fd02f770f21 llvm::legacy::PassManager::run(llvm::Module&) /home/dylan/projects/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1731:3
#17 0x00005579e90cf34a compileModule(char**, llvm::LLVMContext&) /home/dylan/projects/llvm-project/llvm/tools/llc/llc.cpp:577:41
#18 0x00005579e90cd800 main /home/dylan/projects/llvm-project/llvm/tools/llc/llc.cpp:347:13
#19 0x00007fd02b944f4a __libc_start_main (/usr/lib/libc.so.6+0x20f4a)
#20 0x00005579e90cceea _start (./bin/llc+0x22eea)
Stack dump:
0.  Program arguments: ./bin/llc -march=avr -mcpu=atmega328p bugpoint-reduced-simplified.ll -o /dev/null 
1.  Running pass 'Function Pass Manager' on module 'bugpoint-reduced-simplified.ll'.
2. Running pass 'Post-RA pseudo instruction expansion pass' on function '@_ZN3lib3num7flt2dec8strategy5grisu22max_pow10_no_more_than17h83b3c65a945b5158E'

Note that this is coming from @_ZN3lib3num7flt2dec8strategy5grisu22max_pow10_no_more_than17h83b3c65a945b5158E - the Grisu float-string algorithm.

I have minimised down to a testcase, and also collected llc debug outputs

target triple = "avr-unknown-unknown"
target datalayout = "e-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"

define { i8, i32 } @chapstick() {
start:
  ret { i8, i32 } zeroinitializer
}

You can see the build artifacts here.

shepmaster commented 6 years ago

lib::num::flt2dec::strategy

In our own libcore, we've disabled a lot of this code because it's simply too big. This may not be an extra blocker from moving to LLVM 6.

brainlag commented 6 years ago

Do I understand this code correct. It tries to return a struct with the fields initialized to zero?

And if this is the machine code:

bb.0.start:
  %2:ld8 = LDIRdK 0
  %3:dldregs = LDIWRdK 0
  $r21r20 = COPY %2
  $r23r22 = COPY %3
  $r24 = COPY %3
  RET implicit $r21r20, implicit $r23r22, implicit $r24

Then the for some reason the copy instructions for 16bit registers and 8bit registers get mixed up. This doesn't sound to hard to fix.

shepmaster commented 6 years ago

It tries to return a struct with the fields initialized to zero?

That seems like it to me!

This doesn't sound to hard to fix.

Awesome!

brainlag commented 6 years ago

~~~I think I found it. In AVRTargetLowering::LowerReturn the destination of the copy is reversed because of the calling convention but not the source.~~~



~~~But here we go [gist](https://gist.github.com/brainlag/309d47cd78aedc081773d7e14a65468d):~~~

```diff
diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp
index 7ac8a136e6b..4f7d3d6a815 100644
--- a/lib/Target/AVR/AVRISelLowering.cpp
+++ b/lib/Target/AVR/AVRISelLowering.cpp
@@ -1371,16 +1371,12 @@ AVRTargetLowering::LowerReturn(SDValue Chain, CallingConv::ID CallConv,
   MachineFunction &MF = DAG.getMachineFunction();
   unsigned e = RVLocs.size();

-  // Reverse splitted return values to get the "big endian" format required
-  // to agree with the calling convention ABI.
-  if (e > 1) {
-    std::reverse(RVLocs.begin(), RVLocs.end());
-  }
-
   SDValue Flag;
   SmallVector<SDValue, 4> RetOps(1, Chain);
   // Copy the result values into the output registers.
-  for (unsigned i = 0; i != e; ++i) {
+  // Reverse loop to get the "big endian" format required
+  // to agree with the calling convention ABI.
+  for (int i = (e - 1); i < 0; --i) {
     CCValAssign &VA = RVLocs[i];
     assert(VA.isRegLoc() && "Can only return in registers!");

diff --git a/test/CodeGen/AVR/rust-avr-bug-92.ll b/test/CodeGen/AVR/rust-avr-bug-92.ll
new file mode 100644
index 00000000000..f93bf7a152d
--- /dev/null
+++ b/test/CodeGen/AVR/rust-avr-bug-92.ll
@@ -0,0 +1,6 @@
+; RUN: llc < %s -march=avr | FileCheck %s
+
+define { i8, i32 } @chapstick() {
+start:
+  ret { i8, i32 } zeroinitializer
+}
\ No newline at end of file
```

shepmaster commented 6 years ago

Can you narrow down what exactly the problem is? Is it just that the condition (if (e > 1)) has been removed? Is it that the call to std::reverse is "broken"?

I wonder if it would still work if the only change was removing if (e > 1) {.

brainlag commented 6 years ago

Damn, I should have run the tests before. It breaks a lot of stuff. It made so much sense.

brainlag commented 6 years ago

I think I know whats going on. The AVR couldn't handle returning structs like this at all (see #57). It only expects to ever return an iXX and only when XX is >=32 it needs to reverse the registers because of the calling convention. Now for this bug where we have { i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse registers 1 and 2 but not 0.

brainlag commented 6 years ago

~~I don't know if this produces correct output but it prevents the crash. This patch just avoids to return a struct through registers. Which only works for a few/small structs anyway.~~

https://gist.github.com/brainlag/4d9217eefcac1f06df682415975d97c3

diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp index 7ac8a136e6b..e0bb2f317d1 100644 --- a/lib/Target/AVR/AVRISelLowering.cpp +++ b/lib/Target/AVR/AVRISelLowering.cpp @@ -1346,7 +1346,7 @@ AVRTargetLowering::CanLowerReturn(CallingConv::ID CallConv, CCState CCInfo(CallConv, isVarArg, MF, RVLocs, Context); auto CCFunction = CCAssignFnForReturn(CallConv); - return CCInfo.CheckReturn(Outs, CCFunction); + return !MF.getFunction().getReturnType()->isStructTy() && CCInfo.CheckReturn(Outs, CCFunction); } SDValue diff --git a/test/CodeGen/AVR/rust-avr-bug-92.ll b/test/CodeGen/AVR/rust-avr-bug-92.ll new file mode 100644 index 00000000000..e76ce630c67 --- /dev/null +++ b/test/CodeGen/AVR/rust-avr-bug-92.ll @@ -0,0 +1,7 @@ +; RUN: llc < %s -march=avr | FileCheck %s + +; CHECK-LABEL: main +define { i8, i32 } @main() { +start: + ret { i8, i32 } zeroinitializer +} \ No newline at end of file

dylanmckay commented 6 years ago

I think I know whats going on. The AVR couldn't handle returning structs like this at all (see #57). It only expects to ever return an iXX and only when XX is >=32 it needs to reverse the registers because of the calling convention. Now for this bug where we have { i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse registers 1 and 2 but not 0.

This makes a lot of sense. Sounds like we need to adjust this logic and write some more calling convention tests.

I don't know if this produces correct output but it prevents the crash. This patch just avoids to return a struct through registers. Which only works for a few/small structs anyway.

I suspect that this patch will break the calling convention for all functions which return structs, as they will now always be passed on the stack. FWIW, code compiled with this patch will work with code compiled under the same patch, but it would not work when compiling against objects from an older version of LLVM or AVR-GCC generated objects.

I am fairly busy the next handful of days, but after that this bug is at the top of my priority list. If anybody wants to look at this in the interim (please do!), feel free to ask questions on this issue, or ping me on Gitter.

dylanmckay commented 6 years ago

In the past to fix calling convention issues like this, I've taken a C file and compared it side-by-side with LLVM/GCC compiled code.

I've found that this is the best way to make sure our calling convention implementation is accurate. It looks like we are missing struct-return tests, which would be good to add.

brainlag commented 6 years ago

I think I know whats going on. The AVR couldn't handle returning structs like this at all (see #57). It only expects to ever return an iXX and only when XX is >=32 it needs to reverse the registers because of the calling convention. Now for this bug where we have { i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse registers 1 and 2 but not 0.

This makes a lot of sense. Sounds like we need to adjust this logic and write some more calling convention tests.

I have code for this, but it felt like a ugly hack to me and after testing with a couple of different structs I found it really hard to find structs which don't use the stack. If a struct doesn't fit within 3 (or maybe for 4?) registered it used the stack even without this patch.

I update with examples later.

brainlag commented 6 years ago

I tried it again. If the struct is <= 64 bit you can pass it without the stack.

Examples

```llvm ; RUN: llc < %s -march=avr | FileCheck %s ; CHECK-LABEL: retstruct1 define { i8, i32 } @retstruct1() { start: ret { i8, i32 } zeroinitializer } ; CHECK-LABEL: retstruct2 define { i16, i32 } @retstruct2() { start: ret { i16, i32 } zeroinitializer } ; CHECK-LABEL: retstruct3 define { i32, i32 } @retstruct3() { start: ret { i32, i32 } zeroinitializer } ; CHECK-LABEL: retstruct4 define { i8, i64 } @retstruct4() { start: ret { i8, i64 } zeroinitializer } ; CHECK-LABEL: retstruct5 define { i8, i32, i16 } @retstruct5() { start: ret { i8, i32, i16 } zeroinitializer } ; CHECK-LABEL: retstruct6 define { i8, i32, i32 } @retstruct6() { start: ret { i8, i32, i32 } zeroinitializer } ```

The resulting asm code

```asm .text .file "rust-avr-bug-92.1.ll" .globl retstruct1 ; -- Begin function retstruct1 .p2align 1 .type retstruct1,@function retstruct1: ; @retstruct1 ; %bb.0: ; %start ldi r22, 0 ldi r23, 0 ldi r24, 0 mov r20, r22 mov r21, r23 ret .Lfunc_end0: .size retstruct1, .Lfunc_end0-retstruct1 ; -- End function .globl retstruct2 ; -- Begin function retstruct2 .p2align 1 .type retstruct2,@function retstruct2: ; @retstruct2 ; %bb.0: ; %start ldi r24, 0 ldi r25, 0 mov r22, r24 mov r23, r25 mov r20, r24 mov r21, r25 ret .Lfunc_end1: .size retstruct2, .Lfunc_end1-retstruct2 ; -- End function .globl retstruct3 ; -- Begin function retstruct3 .p2align 1 .type retstruct3,@function retstruct3: ; @retstruct3 ; %bb.0: ; %start ldi r24, 0 ldi r25, 0 mov r22, r24 mov r23, r25 mov r20, r24 mov r21, r25 mov r18, r24 mov r19, r25 ret .Lfunc_end2: .size retstruct3, .Lfunc_end2-retstruct3 ; -- End function .globl retstruct4 ; -- Begin function retstruct4 .p2align 1 .type retstruct4,@function retstruct4: ; @retstruct4 ; %bb.0: ; %start mov r30, r24 mov r31, r25 ldi r24, 0 ldi r25, 0 std Z+7, r24 std Z+8, r25 std Z+5, r24 std Z+6, r25 std Z+3, r24 std Z+4, r25 ldi r18, 0 st Z+, r18 st Z, r24 std Z+1, r25 ret .Lfunc_end3: .size retstruct4, .Lfunc_end3-retstruct4 ; -- End function .globl retstruct5 ; -- Begin function retstruct5 .p2align 1 .type retstruct5,@function retstruct5: ; @retstruct5 ; %bb.0: ; %start ldi r22, 0 ldi r23, 0 ldi r24, 0 mov r20, r22 mov r21, r23 mov r18, r22 mov r19, r23 ret .Lfunc_end4: .size retstruct5, .Lfunc_end4-retstruct5 ; -- End function .globl retstruct6 ; -- Begin function retstruct6 .p2align 1 .type retstruct6,@function retstruct6: ; @retstruct6 ; %bb.0: ; %start mov r30, r24 mov r31, r25 ldi r24, 0 ldi r25, 0 std Z+7, r24 std Z+8, r25 std Z+5, r24 std Z+6, r25 std Z+3, r24 std Z+4, r25 ldi r18, 0 st Z+, r18 st Z, r24 std Z+1, r25 ret .Lfunc_end5: .size retstruct6, .Lfunc_end5-retstruct6 ; -- End function ; Declaring this symbol tells the CRT that it should ;copy all variables from program memory to RAM on startup .globl __do_copy_data ; Declaring this symbol tells the CRT that it should ;clear the zeroed data section on startup .globl __do_clear_bss ```

The patch which fixes the register ordering

```diff diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp index e0bb2f317d1..4dc7efaa51c 100644 --- a/lib/Target/AVR/AVRISelLowering.cpp +++ b/lib/Target/AVR/AVRISelLowering.cpp @@ -1374,7 +1374,24 @@ AVRTargetLowering::LowerReturn(SDValue Chain, CallingConv::ID CallConv, // Reverse splitted return values to get the "big endian" format required // to agree with the calling convention ABI. if (e > 1) { - std::reverse(RVLocs.begin(), RVLocs.end()); + // some hackery because SelectionDAGBuilder does not split up arguments properly + Type *retType = MF.getFunction().getReturnType(); + if (retType->isStructTy()) { + if (retType->getNumContainedTypes() > 1 && retType->getNumContainedTypes() > e) { + for (unsigned i = 0, pos = 0; i < retType->getNumContainedTypes(); ++i) { + Type *field = retType->getContainedType(i); + if(field->isIntegerTy() && field->getIntegerBitWidth() > 16) { + int Size = field->getIntegerBitWidth() / 16; + std::reverse(RVLocs.begin()+ pos, RVLocs.end() + pos + Size); + pos += Size; + } else { + pos++; + } + } + } + } else { + std::reverse(RVLocs.begin(), RVLocs.end()); + } } SDValue Flag; ```

dylanmckay commented 6 years ago

Great work!

I will review the patch in the next few days.

jhwgh1968 commented 6 years ago

Since @dylanmckay has been busy, and I've been trying to get engaged in this projct, I applied the patch and tried to compile libcore.

After working around #95 by lowering the optimization level, I got a new crash:

rustc: /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:8736: void llvm::SelectionDAGISel::LowerArguments(const llvm::Function&): Assertion `InVals.size() == Ins.size() && "LowerFormalArguments didn't emit the correct number of values!"' failed.
(gdb) bt
#0  0x00007ffff71f8860 in raise () from /usr/lib/libc.so.6
#1  0x00007ffff71f9ec9 in abort () from /usr/lib/libc.so.6
#2  0x00007ffff71f10bc in __assert_fail_base () from /usr/lib/libc.so.6
#3  0x00007ffff71f1133 in __assert_fail () from /usr/lib/libc.so.6
#4  0x00007fffeca26c14 in llvm::SelectionDAGISel::LowerArguments (
    this=this@entry=0x7fffb61d9000, F=...)
    at /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:8735
#5  0x00007fffeca77d6e in llvm::SelectionDAGISel::SelectAllBasicBlocks (
    this=this@entry=0x7fffb61d9000, Fn=...)
    at /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:1402
#6  0x00007fffeca78d72 in llvm::SelectionDAGISel::runOnMachineFunction (
    this=<optimized out>, mf=...)
    at /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:466
#7  0x00007fffecce12c5 in llvm::MachineFunctionPass::runOnFunction (
    this=0x7fffb61d9000, F=...)
    at /home/admin/Software/avr-rust/src/llvm/lib/CodeGen/MachineFunctionPass.cpp:62
(gdb) l 8735
8730          DAG.getRoot(), F.getCallingConv(), F.isVarArg(), Ins, dl, DAG, InVals);
8731
8732      // Verify that the target's LowerFormalArguments behaved as expected.
8733      assert(NewRoot.getNode() && NewRoot.getValueType() == MVT::Other &&
8734             "LowerFormalArguments didn't return a valid chain!");
8735      assert(InVals.size() == Ins.size() &&
8736             "LowerFormalArguments didn't emit the correct number of values!");
8737      DEBUG({
8738          for (unsigned i = 0, e = Ins.size(); i != e; ++i) {
8739            assert(InVals[i].getNode() &&

I poked around a bit, but as ignorant as I am about LLVM, it wasn't very insightful.

EDIT: I guess one thing is worth adding. If I counted fields correctly, the length of InVals is 8, while the length of Ins is 16.

shepmaster commented 6 years ago

as ignorant as I am about LLVM

Welcome to the club! 😸

jhwgh1968 commented 6 years ago

Welcome to the club! :smile_cat:

Thanks! :smile:

In that case, does @brainlag have any more ideas?

brainlag commented 6 years ago

Are you sure picked up the patch for #57?

jhwgh1968 commented 6 years ago

I just checked, @brainlag, and the patch for #57 is the very last commit I have. I am using @shepmaster's rebase here.

brainlag commented 6 years ago

Ok, you should pin the problem down by making a reduced testcase. Then create a new bug with the LLVM IR code that triggers the bug.

jhwgh1968 commented 6 years ago

Whoops! It seems that I was using the wrong version of llc. Nevermind.

Thanks for the link about reduced testcases, though. I always wondered how those got generated. :smile:

shepmaster commented 6 years ago

@brainlag to double-check: should both this patch and this patch be applied, or does the second one replace the first?

brainlag commented 6 years ago

Only the second one.

shepmaster commented 6 years ago

Only the second one.

Cool. I've crossed out the first one to help in the future.

dylanmckay commented 6 years ago

The current patch causes a few test failures.

I fixed the ones in add.ll and sub.ll in this patch. Two other tests are failing.

The tests are failing because registers are changing. I am unsure if the function calling convention argument registers have changed. The two tests I fixed up made the IR significantly uglier (see add+sub).

brainlag commented 6 years ago

IIRC this patch did not break any tests. There must be something else going on.

dylanmckay commented 6 years ago

I suspect these two tests are existing failures - I'm going to check again.

leo60228 commented 6 years ago

Is there an update on this? If this hasn't been fixed, is this a good problem for a beginner (partially to Rust and completely to LLVM)?

brainlag commented 6 years ago

Yes, this should be good, @dylanmckay just need to verify it and upstream it.

TimNN commented 6 years ago

Hi!

I recently tried to build avr-rust at the current Rust HEAD and attempted to build stock libcore. (I'm not really sure if that is supposed to work or not). After patching rustc to handle separate program-data address spaces (I'll send a PR for that ~soon), I ran into the impossible reg-to-reg copy issue. I applied the "current patch" referenced above, however that still fails with:

llc: /home/logic/avr/src/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:8603: std::pair<llvm::SDValue, llvm::SDValue> llvm::TargetLowering::LowerCallTo(llvm::TargetLowering::CallLoweringInfo&) const: Assertion `EVT(CLI.Ins[i].VT) == InVals[i].getValueType() && "LowerCall emitted a value with the wrong type!"' failed.
Stack dump:
0.  Program arguments: llc min0.ll
1.  Running pass 'Function Pass Manager' on module 'min0.ll'.
2.  Running pass 'AVR DAG->DAG Instruction Selection' on function '@core_iter_TakeWhile_try_fold_closure'

[...]

#9 0x00007f58ae86dcb9 llvm::TargetLowering::LowerCallTo(llvm::TargetLowering::CallLoweringInfo&) const /home/logic/avr/src/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:8602:0
#10 0x00007f58ae860d25 llvm::SelectionDAGBuilder::lowerInvokable(llvm::TargetLowering::CallLoweringInfo&, llvm::BasicBlock const*) /home/logic/avr/src/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:6395:0
#11 0x00007f58ae861617 llvm::SelectionDAGBuilder::LowerCallTo(llvm::ImmutableCallSite, llvm::SDValue, bool, llvm::BasicBlock const*) /home/logic/avr/src/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:6508:0
#12 0x00007f58ae86415c llvm::SelectionDAGBuilder::visitCall(llvm::CallInst const&) /home/logic/avr/src/src/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp:7058:0

[...]

Reduced Testcase

Attaching the debugger showed the following:

The assertion happens while lowering %3 = call addrspace(1) { i8, i16 } @core_iter_LoopState_from_try(i16 %2) (from calling I.dump() in frame #12 above). The issue appears to be that while CLI.Ins contains the types {i8, i16}, InVals contains the types {i16, i8} (note the swapped order).

To me, that seems very connected to the proposed patch in this PR, so I'm reporting this here rather than in a new issue.

TimNN commented 6 years ago

Looking at AVRISelLowering.cpp, there appear to be a total of three places where arguments are reversed:

1) The LowerReturn function handled by the patch. 2) The LowerCallResult function which looks very similar to (1) but isn't patched. 3) The analyzeStandardArguments which I have no idea about.

TimNN commented 6 years ago

Applying the same patch in location (2) seems to fix the assertion failure I reported above. (Although now I'm hitting #101).

dylanmckay commented 6 years ago

Here's a patch that generalises @brainlag's original patch to both locations including the one @TimNN noticed was missing.

commit 912503b74d3c6dde0aa259205e72ca7369545260
Author: Dylan McKay <me@dylanmckay.io>
Date:   Sun Sep 2 14:20:48 2018 +1200

    [AVR] Fix for avr-rust/rust#92

diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp
index 57fc978b54b..a7b3bb7f155 100644
--- a/lib/Target/AVR/AVRISelLowering.cpp
+++ b/lib/Target/AVR/AVRISelLowering.cpp
@@ -1298,6 +1298,35 @@ SDValue AVRTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
                          InVals);
 }

+/// Reverse splitted return values to get the "big endian" format required
+/// to agree with the calling convention ABI.
+static void ReverseArgumentsToBigEndian(MachineFunction &MF,
+                                        SmallVector<CCValAssign, 16> &RVLocs) {
+  if (RVLocs.size() > 1) {
+    // Some hackery because SelectionDAGBuilder does not split
+    // up arguments properly
+    Type *retType = MF.getFunction().getReturnType();
+    if (retType->isStructTy()) {
+      if (retType->getNumContainedTypes() > 1 &&
+          retType->getNumContainedTypes() > RVLocs.size()) {
+        for (unsigned i = 0, pos = 0;
+            i < retType->getNumContainedTypes(); ++i) {
+          Type *field = retType->getContainedType(i);
+          if(field->isIntegerTy() && field->getIntegerBitWidth() > 16) {
+            int Size = field->getIntegerBitWidth() / 16;
+            std::reverse(RVLocs.begin()+ pos, RVLocs.end() + pos + Size);
+            pos += Size;
+          } else {
+            pos++;
+          }
+        }
+      }
+    } else {
+      std::reverse(RVLocs.begin(), RVLocs.end());
+    }
+  }
+}
+
 /// Lower the result values of a call into the
 /// appropriate copies out of appropriate physical registers.
 ///
@@ -1315,10 +1344,8 @@ SDValue AVRTargetLowering::LowerCallResult(
   auto CCFunction = CCAssignFnForReturn(CallConv);
   CCInfo.AnalyzeCallResult(Ins, CCFunction);

-  if (CallConv != CallingConv::AVR_BUILTIN && RVLocs.size() > 1) {
-    // Reverse splitted return values to get the "big endian" format required
-    // to agree with the calling convention ABI.
-    std::reverse(RVLocs.begin(), RVLocs.end());
+  if (CallConv != CallingConv::AVR_BUILTIN) {
+    ReverseArgumentsToBigEndian(DAG.getMachineFunction(), RVLocs);
   }

   // Copy all of the result registers out of their specified physreg.
@@ -1383,9 +1410,7 @@ AVRTargetLowering::LowerReturn(SDValue Chain, CallingConv::ID CallConv,

   // Reverse splitted return values to get the "big endian" format required
   // to agree with the calling convention ABI.
-  if (e > 1) {
-    std::reverse(RVLocs.begin(), RVLocs.end());
-  }
+  ReverseArgumentsToBigEndian(MF, RVLocs);

   SDValue Flag;
   SmallVector<SDValue, 4> RetOps(1, Chain);

dylanmckay commented 6 years ago

I've been a bit slow with this because every time I started looking at the patch, I wasn't confident if it was breaking the calling convention or not, or if it'd be ABI-compatible with code generated by avr-gcc. One issue with this is the fact there is no automated integration testing for AVR - only some small repos I've setup, nothing automated. It would very easy to notice the bug @TimNN spotted if integration tests were testing the interaction between the different ISelLowering routines.

Integration tests are a bit tricky because avr-clang does not have rtlib linking logic (no CRT or libc is linked unless explicitly specified on the command line, including paths). That makes it hard to write tests because you'd kinda have to hardcode the computer-specific avr-gcc link library folders in the source files themselves. I've got a WIP patch that adds this linking logic though. The integration tests repository is located here.

I've done some more research and figured out that we are already incompatible with avr-gcc. I've written up BugZilla PR39251 to track this.

The avr-gcc calling convention docs state:

Return values with a size of 1 byte up to and including a size of 8 bytes will be returned in registers. Return values whose size is outside that range will be returned in memory.

avr-gcc wiki

I've done some experimentation with avr-gcc locally and noticed that integer types of 8 bytes or less will in fact be returned in registers, although contradictorily structure types, when returned, will only be returned in registers when they are of four bytes or less; the size limit if four for structs, 8 for integers.

Currently avr-llvm seems to always return structs on the stack.

Because of this, I think it's better to accept the patch and write a bug (EDIT: done in PR39251) for the avr-gcc incompatibility. I'm 99% certain the incompatibility was already there before Peter's patch (I am too lazy to checkout HEAD~1 and retest).

I need to write some more tests on the patch, hopefully get integration tests working.

dylanmckay commented 6 years ago

I've written r39251 to track the avr-gcc incompatibility.

dylanmckay commented 6 years ago

Current version of the patch, with tests and a bit of documentation.

patch.diff

shepmaster commented 6 years ago

ABI-compatible with code generated by avr-gcc

I've mentioned this before, but I as I understand this statement, it should be a non-goal. Rust already has the distinct notion of a rust calling convention and a C calling convention. The rust calling convention should be whatever we can write that is most efficient; C should match whatever clang produces, which presumably should attempt to match avr-gcc.

shepmaster commented 6 years ago

Ok, I think I understand better about how this works, which negates what I said before. Instead, it is correct to have the Clang + LLVM generate code that matches AVR-GCC.

Let me sum up what nagisa and Mark-Simulacrum told me:

the "Rust" calling convention actually gets mapped to LLVM's "C"
how the arguments are passed also matters for ABI purposes, but this is controlled by the compiler frontend. To get something compatible with C, the arguments must also be passed around in a specific manner. The "Rust" calling convention does not ensure that it matches what C would do. For example:
```
pub extern "foo" fn foo(_: (i32, i32)) {}
// "Rust" => define void @foo(i32, i32)
// "C"    => define void @foo(i64)
// From the playground (x86-64)
```

Differences could arise in how the Rust compiler with the Rust calling convention generates LLVM IR. For example, it might split up a single Rust argument into multiple LLVM IR arguments.

dylanmckay commented 6 years ago

Well stated @shepmaster.

It does somewhat feel in vain given that I don't think anybody actually uses avr-clang in C projects, as the support is pretty lacking (for example, no crt or libgcc libraries are linked automatically).

I wonder if anybody has been using avr-gcc in AVR cargo build scripts, linking avr-gcc objects, and using extern "C" FFI. Regardless, it is important that we maintain compatibility.

TimNN commented 6 years ago

A small update: the current patch from above, which reverses things in two locations, is actually incorrect. As was my earlier comment

Applying the same patch in location (2) seems to fix the assertion failure I reported above. (Although now I'm hitting #101).

The problem was that when I copied the patch to location (2), I did so incorrectly: Instead of "properly" reversing things, I messed up the code to never reverse anything (I initialised e to RVLocs.size() before the call to AnalyzeCallResult, which means that e was always 0, and thus never > 1).

You can find a working patch at https://github.com/TimNN/llvm/commit/71713c240836424023d57da7860ce623f27eab72

shepmaster commented 6 years ago

We've recently upgraded to LLVM 8 🎉 I'm going to close any bug that is reported against an older version of LLVM. If you are still having this issue with the LLVM-8 based code, please ping me and I can reopen the issue!

shepmaster commented 6 years ago

Reopening as this has a patch

dylanmckay commented 6 years ago

I went to upstream this, but I noticed there are a few test failures (call.ll, div.ll, rem.ll).

I made a few changes (added the new AVR doc to the docs index, fixed the failed tests so that they match what LLVM generates with the patch) and pushed it up onto dylanmckay/llvm@upstream-fix-avr-rust-92.

It looks like the patch changes the ABI, not just for structs, but also for indirect calls to function pointers returning sizes >16-bits.

The patch also changes the ABI of the libgcc/compiler-rt calls for division and remainder. You can see the changes in the below patch in the update to div.ll and rem.ll. As far as I can tell, this will either completely break div and mod or completely fix if it is the case it is already broken. It looks like the tests for sdiv32 et al have been around since late 2016, so it's likely they're correct.

commit 3c474f31c49643952326eae05fdfa2be4a7bff88
Author: Dylan McKay <me@dylanmckay.io>
Date:   Mon Nov 5 19:00:11 2018 +1300

    [AVR] Reverse split return values to into the "big endian" format required by the ABI

    When returning structs, the fields need to be reversed. Without this,
    the ABI of the generated executables is not compatible with avr-gcc,
    nor will LLVM allow it to be compiled.

    Here is the assertion message when complex structs are used:

        Impossible reg-to-reg copy
        UNREACHABLE executed at llvm/lib/Target/AVR/AVRInstrInfo.cpp!

    This was first noticed in avr-rust/rust#92.

    Description by Peter

        The AVR couldn't handle returning structs like this at all (see avr-rust/rust#57).
        It only expects to ever return an iXX and only when XX is >=32 it needs to reverse
        the registers because of the calling convention. Now for this bug where we have
        { i8, i32 } and registers {[0] = i8, [1] = i16, [2] = i16} it only should reverse
        registers 1 and 2 but not 0.

    Patch by Peter Nimmervoll (@brainlag) and Tim Neumann (@TimNN)

diff --git a/test/CodeGen/AVR/call.ll b/test/CodeGen/AVR/call.ll
index a2556e8c1e6..4a1de6eb603 100644
--- a/test/CodeGen/AVR/call.ll
+++ b/test/CodeGen/AVR/call.ll
@@ -172,15 +172,17 @@ define void @testcallprologue() {
 define i32 @icall(i32 (i32)* %foo) {
 ; CHECK-LABEL: icall:
 ; CHECK: movw r30, r24
-; CHECK: ldi r22, 147
-; CHECK: ldi r23, 248
-; CHECK: ldi r24, 214
-; CHECK: ldi r25, 198
-; CHECK: icall
-; CHECK: subi r22, 251
-; CHECK: sbci r23, 255
-; CHECK: sbci r24, 255
-; CHECK: sbci r25, 255
+; CHECK-NEXT: ldi r22, 147
+; CHECK-NEXT: ldi r23, 248
+; CHECK-NEXT: ldi r24, 214
+; CHECK-NEXT: ldi r25, 198
+; CHECK-NEXT: icall
+; CHECK-NEXT: movw r18, r22
+; CHECK-NEXT: subi r24, 251
+; CHECK-NEXT: sbci r25, 255
+; CHECK-NEXT: sbci r18, 255
+; CHECK-NEXT: sbci r19, 255
+
   %1 = call i32 %foo(i32 3335977107)
   %2 = add nsw i32 %1, 5
   ret i32 %2
@@ -200,10 +202,12 @@ define i32 @externcall(float %a, float %b) {
 ; CHECK: movw r20, [[REG1]]
 ; CHECK: call __divsf3
 ; CHECK: call foofloat
-; CHECK: subi r22, 251
-; CHECK: sbci r23, 255
-; CHECK: sbci r24, 255
+; CHECK: movw r18, r22
+; CHECK: subi r24, 251
 ; CHECK: sbci r25, 255
+; CHECK: sbci r18, 255
+; CHECK: sbci r19, 255
+
   %1 = fdiv float %b, %a
   %2 = call i32 @foofloat(float %1)
   %3 = add nsw i32 %2, 5
diff --git a/test/CodeGen/AVR/div.ll b/test/CodeGen/AVR/div.ll
index 7626ecb8172..7ff8ae33456 100644
--- a/test/CodeGen/AVR/div.ll
+++ b/test/CodeGen/AVR/div.ll
@@ -44,8 +44,9 @@ define i16 @sdiv16(i16 %a, i16 %b) {
 define i32 @udiv32(i32 %a, i32 %b) {
 ; CHECK-LABEL: udiv32:
 ; CHECK: call __udivmodsi4
-; CHECK-NEXT: movw r22, r18
-; CHECK-NEXT: movw r24, r20
+; CHECK-NEXT: movw r18, r22
+; CHECK-NEXT: movw r22, r24
+; CHECK-NEXT: movw r24, r18
 ; CHECK-NEXT: ret
   %quot = udiv i32 %a, %b
   ret i32 %quot
@@ -55,8 +56,9 @@ define i32 @udiv32(i32 %a, i32 %b) {
 define i32 @sdiv32(i32 %a, i32 %b) {
 ; CHECK-LABEL: sdiv32:
 ; CHECK: call __divmodsi4
-; CHECK-NEXT: movw r22, r18
-; CHECK-NEXT: movw r24, r20
+; CHECK-NEXT: movw r18, r22
+; CHECK-NEXT: movw r22, r24
+; CHECK-NEXT: movw r24, r18
 ; CHECK-NEXT: ret
   %quot = sdiv i32 %a, %b
   ret i32 %quot
diff --git a/test/CodeGen/AVR/rem.ll b/test/CodeGen/AVR/rem.ll
index 47573e8dafc..eaa12b79774 100644
--- a/test/CodeGen/AVR/rem.ll
+++ b/test/CodeGen/AVR/rem.ll
@@ -42,6 +42,8 @@ define i16 @srem16(i16 %a, i16 %b) {
 define i32 @urem32(i32 %a, i32 %b) {
 ; CHECK-LABEL: urem32:
 ; CHECK: call __udivmodsi4
+; CHECK: movw r22, r20
+; CHECK: movw r24, r18
 ; CHECK-NEXT: ret
   %rem = urem i32 %a, %b
   ret i32 %rem
@@ -51,6 +53,8 @@ define i32 @urem32(i32 %a, i32 %b) {
 define i32 @srem32(i32 %a, i32 %b) {
 ; CHECK-LABEL: srem32:
 ; CHECK: call __divmodsi4
+; CHECK: movw r22, r20
+; CHECK: movw r24, r18
 ; CHECK-NEXT: ret
   %rem = srem i32 %a, %b
   ret i32 %rem

dylanmckay commented 6 years ago

I have a suspicion

Functions with return types satisfying retType->getNumContainedTypes() > 1 && retType->getNumContainedTypes() > RVLocs.size() are no longer reversed

Unsure if this is intentional.

diff --git a/lib/Target/AVR/AVRISelLowering.cpp b/lib/Target/AVR/AVRISelLowering.cpp
index 8b7282f0c71..169c8e5cda2 100644
--- a/lib/Target/AVR/AVRISelLowering.cpp
+++ b/lib/Target/AVR/AVRISelLowering.cpp
@@ -1303,40 +1303,44 @@ SDValue AVRTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
 static void ReverseArgumentsToBigEndian(MachineFunction &MF,
                                         SmallVector<CCValAssign, 16> &RVLocs) {
   if (RVLocs.size() > 1) {
     // Some hackery because SelectionDAGBuilder does not split
     // up arguments properly
     Type *retType = MF.getFunction().getReturnType();
     if (retType->isStructTy()) {
       if (retType->getNumContainedTypes() > 1 &&
           retType->getNumContainedTypes() > RVLocs.size()) {
         for (unsigned i = 0, pos = 0;
             i < retType->getNumContainedTypes(); ++i) {
           Type *field = retType->getContainedType(i);
           if(field->isIntegerTy() && field->getIntegerBitWidth() > 16) {
             int Size = field->getIntegerBitWidth() / 16;
             std::reverse(RVLocs.begin()+ pos, RVLocs.end() + pos + Size);
             pos += Size;
           } else {
             pos++;
           }
         }
+      } else {
+        // FIXME: In the old version of this logic, this else branch
+        // would still reverse the return values. Old logic reversed if and
+        // always if RVLocs.size() > 1
+        // std::reverse(RVLocs.begin(), RVLocs.end()); ??
       }
     } else {
       std::reverse(RVLocs.begin(), RVLocs.end());
     }
   }
 }

brainlag commented 6 years ago

So this means you hitting the struct path without having any structs?

dylanmckay commented 6 years ago

It seems it, yeah. Unless there's something else going on like another location that needs to be reverse'd. I saw one last std::reverse call, but it was related to function arguments. I am unsure if that could be the problem though.

carlos4242 commented 5 years ago

This is a very old thread. But I'm joining in. 😄 The stuff in here is mainlined in avr-llvm and now we're getting some fairly serious breaks that are affecting me. #130 seems to be caused by this. I think when you just get a normal 32 bit return value the words are getting swapped. I'm creating unit tests for this and I'll try to attach them here and to #130. But for example in a really simple case, if you have a function that returns a 32 bit int and at the end just calls another external function that returns a 32 bit int, it should have no statements between the call XXX and ret, because the value should be returned in r22-r25 from the extern function and then returned back immediately. In all cases it thinks things are the wrong way round so it swaps r22/r23 with r24/r25. That breaks the rem.ll, div.ll and call.ll tests in exactly that way. Plus it breaks swift for Arduino because my code was doing int16 multiplications and returning 0 every time, real world impact. lol.

Anyway, it's going to take me a long time to understand what's going on in this thread, let alone with the code because I'm such a n00b, @brainlag @shepmaster @dylanmckay you guys want to chip in?

carlos4242 commented 5 years ago

OK, think I've found what's causing all the problems @dylanmckay. In various bits of consolidation, etc. code got accidentally deleted from LowerCallResult.

The patch should have been this:

@@ -1315,10 +1344,8 @@ SDValue AVRTargetLowering::LowerCallResult(
   auto CCFunction = CCAssignFnForReturn(CallConv);
   CCInfo.AnalyzeCallResult(Ins, CCFunction);

-  if (CallConv != CallingConv::AVR_BUILTIN && RVLocs.size() > 1) {
-    // Reverse splitted return values to get the "big endian" format required
-    // to agree with the calling convention ABI.
-    std::reverse(RVLocs.begin(), RVLocs.end());
+  if (CallConv != CallingConv::AVR_BUILTIN) {
+    ReverseArgumentsToBigEndian(DAG.getMachineFunction(), RVLocs);
   }

But ended up as this:

   auto CCFunction = CCAssignFnForReturn(CallConv);
   CCInfo.AnalyzeCallResult(Ins, CCFunction);

-  if (CallConv != CallingConv::AVR_BUILTIN && RVLocs.size() > 1) {
-    // Reverse splitted return values to get the "big endian" format required
-    // to agree with the calling convention ABI.
-    std::reverse(RVLocs.begin(), RVLocs.end());
-   }

So ReverseArgumentsToBigEndian is never called from LowerCallResult. I'm pretty sure this is just an accident. I'll do a small PR over the weekend with some unit tests.

carlos4242 commented 5 years ago

Background, I think the way the code was working was this:

For results coming back from calls or being returned from a function that are i8 or i16, the standard register mapping works. This will give you r24 or r24/r25 as a pair.

If you had a result coming back from a call or function that was i32 or i64, SelectionDAGBuilder would ask "how many registers and what types will we use?" and get back "2 registers of i16" (or "4 registers of i16") These would be naively assigned using RetCC_AVR from AVRCallingConv.td as register 0 (the low 16 bits): r24/r25, register 1 (the high 16 bits): r22/r23, which is word swapped. To fix that, if there were more than two registers in a return value, the old code reversed them and they got assigned the other way around so it all worked, but only for returning simple scalar types. I'm not sure they had ever got struct type returns working. If this was largely unchanged from the old code on Source Forge, that code would have probably only ever been tested on LLVM IR from clang, which probably wouldn't often (ever?) produce a struct return.

carlos4242 commented 5 years ago

OK, I've come up with a patch: https://gist.github.com/carlos4242/f882c9f1213ef006feb6984acef291d7

crlf0710 commented 5 years ago

Built the official 8.0 release code with AVR target enabled, llced the code in OP, and llc segfaulted.

avr-rust / rust-legacy-fork

UNREACHABLE Impossible reg-to-reg copy #92