llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.25k stars 12.08k forks source link

ARM, Hexagon, and MIPS emit return blocks for which MBB.isReturnBlock returns false #31308

Open rnk opened 7 years ago

rnk commented 7 years ago
Bugzilla Link 31960
Version trunk
OS Windows NT
CC @d0k,@rovka,@jmolloy,@kparzysz-quic,@MatzeB,@nigelp-xmos,@rengolin

Extended Description

Kyle noticed that there are many ways that MachineBasicBlock::isReturnBlock can return false for blocks that actually contain epilogues. It would be nice if this predicate could remain accurate after prologue/epilogue insertion, because it informs machine block placement and other optimizations. This came up during review here: https://reviews.llvm.org/D29153

In particular, the ARM load store optimizer turns tBX_RET pseudo-instrs into tBX indirect branch instructions, which are not flagged with isReturn. Other targets (Mips, Hexagon) also seem to have this problem. You can find blocks ending in indirect branches that aren't returns with this patch:

diff --git a/lib/CodeGen/MachineBlockPlacement.cpp b/lib/CodeGen/MachineBlockPlacement.cpp
index 8a57f00..2ecd083 100644
--- a/lib/CodeGen/MachineBlockPlacement.cpp
+++ b/lib/CodeGen/MachineBlockPlacement.cpp
@@ -2302,6 +2302,15 @@ bool MachineBlockPlacement::runOnMachineFunction(MachineFunction &MF) {
   if (skipFunction(*MF.getFunction()))
     return false;

+  for (auto &MBB : MF) {
+    if (!MBB.isReturnBlock() && MBB.succ_empty() && !MBB.empty() &&
+        MBB.back().isIndirectBranch()) {
+      MBB.dump();
+      llvm_unreachable(
+          "found no-successor indirect branch not flagged as return");
+    }
+  }
+
   // Check for single-block functions and skip them.
   if (std::next(MF.begin()) == MF.end())
     return false;

Here's an example where the ARM load/store optimizer creates this inconsistency:

$ echo 'define i32 @&#8203;simpleframe(<6 x i32>* %p) #&#8203;0 {
entry:
  %0 = load <6 x i32>, <6 x i32>* %p, align 16
  %1 = extractelement <6 x i32> %0, i32 0
  %2 = extractelement <6 x i32> %0, i32 1
  %3 = extractelement <6 x i32> %0, i32 2
  %4 = extractelement <6 x i32> %0, i32 3
  %5 = extractelement <6 x i32> %0, i32 4
  %6 = extractelement <6 x i32> %0, i32 5
  %add1 = add nsw i32 %1, %2
  %add2 = add nsw i32 %add1, %3
  %add3 = add nsw i32 %add2, %4
  %add4 = add nsw i32 %add3, %5
  %add5 = add nsw i32 %add4, %6
  ret i32 %add5
}
' | llc -mtriple=thumbv4t-none--eabi

It would be nice to clean this up eventually by adding more duplicate indirect branch pseudo instructions to carry the isReturn flag.

kparzysz-quic commented 7 years ago

Predication (if conversion) happens after PEI, so the predicated returns on Hexagon appear after PEI has run. Currently the Hexagon frame lowering code does, in fact, rely on MachineBasicBlock::isReturnBlock, which is not the best thing to do (since we have early predication that could in theory predicate a return, if not now then maybe in the future). On the positive note (for Hexagon, again), we do most of the prolog/epilog insertion ourselves and what the generic PEI code does does not affect us that much. We can eliminate the reliance on isReturnBlock completely within Hexagon-specific code.

MatzeB commented 7 years ago

I think this all would become cleaner if we abandon the concept of a per-instruction isReturn flag and instead add a flag to the machine basic block marking it as a return block?

I'm not sure that this would work. If a block has a predicated return, is it returning or not? If an optimization manages to prove that the predicate is always false, and removes the predicated instruction, how should it know to clear the "returning" flag from the block?

How can a predicated return work in practice? The isReturn flag is used to determine whether we need to place epilogue code, restore callee saves etc. There is no way in current CodeGen to deal with predicated returns (would we predicate all the epilog code?).

MatzeB commented 7 years ago

The alternative would be adding a return (and call?) MIFlag. I just hesitated to propose that as the flags are a scarce resource (we only have 4 bits left), so we should be sure before using those up.

kparzysz-quic commented 7 years ago

I think this all would become cleaner if we abandon the concept of a per-instruction isReturn flag and instead add a flag to the machine basic block marking it as a return block?

I'm not sure that this would work. If a block has a predicated return, is it returning or not? If an optimization manages to prove that the predicate is always false, and removes the predicated instruction, how should it know to clear the "returning" flag from the block?

kparzysz-quic commented 7 years ago

I applied that patch and I didn't see any Hexagon tests failing. If you have another Hexagon testcase that shows this behavior, could you attach it here directly?

kparzysz-quic commented 7 years ago

A block can contain a trap or a call to a non-returning function. It will not have any successors, but it won't be a returning block either.

rengolin commented 7 years ago

Thanks for the quick response, Simon!

With the possibility of any instruction that can have the PC as an operand, the ARM case may be a bit harder to get there, if at all, following Matthias' comments.

I don't think we can / should create one pseudo to all variations of opcode+PC, so I'm inclined to consider his suggestion to make this part of the MIR structure, not opcodes.

llvmbot commented 7 years ago

The non-MIPS test cases that fail with Reid's patch are:

LLVM :: CodeGen/ARM/arm-and-tst-peephole.ll
LLVM :: CodeGen/ARM/atomic-cmpxchg.ll
LLVM :: CodeGen/ARM/bit-reverse-to-rbit.ll
LLVM :: CodeGen/ARM/debug-frame-no-debug.ll
LLVM :: CodeGen/ARM/debug-frame-vararg.ll
LLVM :: CodeGen/ARM/global-merge.ll
LLVM :: CodeGen/ARM/machine-licm.ll
LLVM :: CodeGen/ARM/smml.ll
LLVM :: CodeGen/Thumb/dyn-stackalloc.ll
LLVM :: CodeGen/Thumb/long_shift.ll
LLVM :: CodeGen/Thumb/segmented-stacks-dynamic.ll
LLVM :: CodeGen/Thumb/segmented-stacks.ll
LLVM :: CodeGen/Thumb/select.ll
LLVM :: CodeGen/Thumb/stack_guard_remat.ll
LLVM :: CodeGen/Thumb/thumb-shrink-wrapping.ll
LLVM :: CodeGen/Thumb/unord.ll
LLVM :: CodeGen/Thumb/vargs.ll
LLVM :: CodeGen/XCore/llvm-intrinsics.ll

r295109 fixes the MIPS test cases. These were MIPS16 return instruction definitions that were not marked as isReturn = 1.

MatzeB commented 7 years ago

The isReturn flag is defined per opcode. In practice an instruction can sometimes be used as a "return" or as something else. Today this forces us to create new pseudo instructions that duplicate an existing instruction for the sake of setting the isReturn flag (look at all the tailcall calls we have around) or we may miss the "isReturn" variant of an instruction so for a quick fix people just use the existing one without the flag.

I think this all would become cleaner if we abandon the concept of a per-instruction isReturn flag and instead add a flag to the machine basic block marking it as a return block?

rengolin commented 7 years ago

This seems like a simple investigation, hopefully it'll be just the load store optimiser. I'm adding a few more people that have touched that part of the code recently, as it may be a simple fix.