llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.25k stars 12.08k forks source link

Select128 problem #76618

Closed JonPsson1 closed 8 months ago

JonPsson1 commented 10 months ago

cat crash2.i

short a, c;
long b, d;
__int128 e;
int f;
static unsigned __int128 g(unsigned, unsigned __int128, __int128, unsigned,
                           unsigned char);
void h() { g(e, c != 0, e, b, a); }
unsigned __int128 g(unsigned i, unsigned __int128 j, __int128 k, unsigned l,
                    unsigned char m) {
  d = i;
  f = j;
  if (l | m)
    for (;;)
      ;
  return k;
}
void main() {}

/home/ijonpan/llvm-project/install/bin/clang -O3 -march=arch13 crash2.i -o a.out -w -fno-inline -mllvm -pre-RA-sched=list-ilp -mllvm -verify-misched Bad machine code: Call frame size on entry does not match value computed from predecessor

Looks like there is a Select128 inside the call frame setup (below ADJCALLSTACKDOWN), which makes the verifier complain when new blocks are created there.

uweigand commented 10 months ago

Hmm. Looks like common code expects that if a new block is created in between an ADJCALLSTACKDOWN / ADJCALLSTACKUP pair, its incoming "call frame size" is set to the call frame size at the end of the predecessor blocks. Some other targets explicitly maintain this invariant.

And indeed, I can fix this assertion failure by something along these lines:

diff --git a/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp b/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
index 045c4c0aac07..4a8cb6d23c48 100644
--- a/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
+++ b/llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
@@ -8154,6 +8154,11 @@ SystemZTargetLowering::emitSelect(MachineInstr &MI,
   MachineBasicBlock *JoinMBB  = SystemZ::splitBlockAfter(LastMI, MBB);
   MachineBasicBlock *FalseMBB = SystemZ::emitBlockAfter(StartMBB);

+  // Set the call frame size on entry to the new basic blocks.
+  unsigned CallFrameSize = TII->getCallFrameSizeAt(*LastMI);
+  FalseMBB->setCallFrameSize(CallFrameSize);
+  JoinMBB->setCallFrameSize(CallFrameSize);
+
   // Unless CC was killed in the last Select instruction, mark it as
   // live-in to both FalseMBB and JoinMBB.
   if (!CCKilled) {

Not sure why this didn't show up earlier. Also, we probably need to do this everywhere blocks are split.

However, what's a bit annoying is that on SystemZ, the call-frame instructions are in the end all no-ops anyway, since we don't adjust the stack on the fly, but rather create an outgoing argument area big enough for all calls in this function during the prolog. So maybe this "call frame size" value should just be zero everywhere on SystemZ?

@JonPsson1 can you take it from here?

JonPsson1 commented 10 months ago

As to why this hasn't shown up earlier, I tried with i64, in which case ofc an LOC is used. But with 'double', the SelectVR64 is emitted which also needs branching. For one reason or the other, the SelectVR64 is put above the ADJCALLSTACKDOWN by the scheduler. The Select128 may involve some additional instruction, like loading from the constant pool which then gives this schedule.

So it seems that the list-ilp scheduler in this case puts the Select128 inside the call frame setup, while it puts the SelectVR64 above it. The default scheduler puts both of them above it, but it seems theoretically possible that any Select would be put below the ADJCALLSTACKDOWN.

My best idea for handling this now is per the PR above.