Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Systemz backend - verify-machineinstrs error #5826

Closed Quuxplusone closed 7 years ago

Quuxplusone commented 14 years ago
Bugzilla Link PR5327
Status RESOLVED FIXED
Importance P normal
Reported by Sylvère Teissier (quickslyver@free.fr)
Reported on 2009-10-28 13:49:36 -0700
Last modified on 2017-03-07 08:48:27 -0800
Version trunk
Hardware PC Linux
CC anton@korobeynikov.info, florian_hahn@apple.com, jsweval@arxan.com, llvm-bugs@lists.llvm.org, llvm@sunfishcode.online
Fixed by commit(s)
Attachments __udivhi3.c (287 bytes, text/x-csrc)
Blocks
Blocked by
See also
Created attachment 3716
C source

$clang-cc __udivhi3.c -O3 -o - -S -triple=s390x-unknown-unknown -debug -verify-
machineinstrs

(...)

while.body11: 0xa9bbed4, LLVM BB @0xa98ea30, ID#12:
    Predecessors according to CFG: 0xa9bb780 (#7) 0xa9bbe88 (#11)
    %reg1053<def> = PHI %reg1031, mbb<while.body11,0xa9bbe88>, %reg1052,
mbb<while.body11,0xa9bb780>
    JL mbb<while.body11,0xa9bbfc4>, %PSW<imp-use>
    Successors according to CFG: 0xa9bbf78 (#13) 0xa9bbfc4 (#14)

(...)

*** Bad machine code: Using an undefined physical register ***
- function:    __udivhi3
- basic block: while.body11 0xa9bbed4 (#12)
- instruction: JL mbb<while.body11,0xa9bbfc4>, %PSW<imp-use>
- operand 1:   %PSW<imp-use>
fatal error: error in backend: Found 1 machine code errors.

and the final code is:
(...)
    jl  .LBB1_8
    lr  %r3, %r5
.LBB1_8:
    jl  .LBB1_10
    lr  %r2, %r11
.LBB1_10:
(...)

there is a big problem somewhere... but looking at debug output it's hard for
me to see how basic bloc #12 is created.
Quuxplusone commented 14 years ago

Attached __udivhi3.c (287 bytes, text/x-csrc): C source

Quuxplusone commented 14 years ago

This might be due to Dan's changes yesterday

Quuxplusone commented 14 years ago

It is. The code itself is not broken (as far as I'm aware), optimization aside. But the MachineVerifier and ISel have different expectations about how the physreg liveness attributes should be handled here.

Quuxplusone commented 14 years ago
Interesting thing:

If I enable the register scavenger with

Index: lib/Target/SystemZ/SystemZRegisterInfo.h
===================================================================
--- lib/Target/SystemZ/SystemZRegisterInfo.h    (revision 85412)
+++ lib/Target/SystemZ/SystemZRegisterInfo.h    (working copy)
@@ -75,6 +75,10 @@
   unsigned getEHHandlerRegister() const;

   int getDwarfRegNum(unsigned RegNum, bool isEH) const;
+
+  virtual bool requiresRegisterScavenging(const MachineFunction &MF) const {
+    return true;
+  }
 };

 } // end namespace llvm

The same command
$clang-cc __udivhi3.c -O3 -o - -S -triple=s390x-unknown-unknown -debug
-verify-machineinstrs

makes the register scavenger crash:

clang-cc: RegisterScavenging.cpp:196: void llvm::RegScavenger::forward():
Assertion `SubUsed && "Using an undefined register!"' failed.

So I don't think that the Machine verifier is too much alarmist.
Quuxplusone commented 14 years ago
The register scavenger is also being somewhat over-eager here too, because I
don't believe anything really needs to go scavenging for registers in PSW's
register class.  There's certainly an inconsistency here; different parts of
CodeGen are making conflicting assumptions.  LiveVariables currently assumes
that physical registers are not live across block boundaries.  ISel assumes
that it can split blocks in the middle to do things like create CFG triangles
to implement conditional moves.  In this testcase, it happens to be necessary
to split a block at a point where a physical register is live.

The same problem exists on x86 with EFLAGS.  Compile this with llc -march=x86-
64 -verify-machineinstrs:

declare void @foo(double, double)

define void @bar(double %a, double %b, double %c, double %d, double %e, double
%f) nounwind {
  %x = fcmp olt double %a, %b
  %y = select i1 %x, double %c, double %d
  %z = select i1 %x, double %e, double %f
  call void @foo(double %y, double %z)
  ret void
}

I did an experiment and forced ISel to mark EFLAGS as live-in in all the blocks
in the CFG triangle, however LiveVariables still marks the last use of EFLAGS
in the first block as a kill, so the verifier instead complains about a live-in
register not being live out of a predecessor.  Solving this may require
enhancing LiveVariables to have a basic understanding of physical registers
live across block boundaries.
Quuxplusone commented 14 years ago
Hi Dan, I found an another bug in your modifcations:

int print(char * a);
int f(int a);
extern int volatile *d;
int test(int a)
{
  int e;
  e=print(a?"true":"false");
  d[2]=e+5;
  if(a)
  {
    f(10);
  }
  return *d;
}

give at -O3 with clang

test:
    stg %r14, 112(%r15)

    cfi %r2, 0
    larl    %r2, .L.str
    jne .LBB1_2
    larl    %r2, .L.str1
.LBB1_2:
    brasl   %r14, print

    ahi %r2, 5
    larl    %r1, d
    lg  %r1, 0(%r1)
    st  %r2, 8(%r1)
    je  .LBB1_4
    lghi    %r2, 10
    brasl   %r14, f
.LBB1_4:
    larl    %r1, d
    lg  %r1, 0(%r1)
    lgf %r2, 0(%r1)
    lg  %r14, 112(%r15)
    br  %r14

there is a missing cfi %r2,0 before je .LBB1_4

it's because of a select and a brcond that share the same icmp but the select
is lowered with the function EmitInstrWithCustomInserter and the machineBlock
is cutted in two part so the brcond is finally alone without cfi.

I don't known if I explain well or not.
Ask me to reformulate if you don't understand.
Quuxplusone commented 14 years ago

The specific problem in that testcase is that the SystemZ backend's copyRegToReg does not support spilling the PSW register. And the backend isn't configured to allow the scheduler to duplicate the comparison.

I've now found a way to conditionalize the original optimization I did that started all this, so that it isn't used on targets which use MVT::Flag the way SystemZ previously did. This is now committed in r85638. Anton, I believe it is now safe for you to revert my 85357 change to the SystemZ backend to get it back to where it started before I got involved here (though there's a minor merge conflict to resolve), if you choose.

Quuxplusone commented 14 years ago
(In reply to comment #6)
> The specific problem in that testcase is that the SystemZ backend's
> copyRegToReg does not support spilling the PSW register. And the backend isn't
> configured to allow the scheduler to duplicate the comparison.
Well... I think it might be better to add instructions to save / restore CC
part of PSW :)
Quuxplusone commented 14 years ago
(In reply to comment #0)
> Created an attachment (id=3716) [details]
> C source
>
> $clang-cc __udivhi3.c -O3 -o - -S -triple=s390x-unknown-unknown -debug
> -verify-machineinstrs
>
> (...)
>
> while.body11: 0xa9bbed4, LLVM BB @0xa98ea30, ID#12:
>     Predecessors according to CFG: 0xa9bb780 (#7) 0xa9bbe88 (#11)
>         %reg1053<def> = PHI %reg1031, mbb<while.body11,0xa9bbe88>, %reg1052,
> mbb<while.body11,0xa9bb780>
>         JL mbb<while.body11,0xa9bbfc4>, %PSW<imp-use>
>     Successors according to CFG: 0xa9bbf78 (#13) 0xa9bbfc4 (#14)
>
> (...)
>
> *** Bad machine code: Using an undefined physical register ***
> - function:    __udivhi3
> - basic block: while.body11 0xa9bbed4 (#12)
> - instruction: JL mbb<while.body11,0xa9bbfc4>, %PSW<imp-use>
> - operand 1:   %PSW<imp-use>
> fatal error: error in backend: Found 1 machine code errors.
>
> and the final code is:
> (...)
>         jl      .LBB1_8
>         lr      %r3, %r5
> .LBB1_8:
>         jl      .LBB1_10
>         lr      %r2, %r11
> .LBB1_10:
> (...)
>
> there is a big problem somewhere... but looking at debug output it's hard for
> me to see how basic bloc #12 is created.
>

I have not time now to summit a tested patch, but the solution for this 1st bug
is to add  Defs = [PSW] to Select32 and 64 instructions in InstrInfo.td
Quuxplusone commented 14 years ago
(In reply to comment #7)
> (In reply to comment #6)
> > The specific problem in that testcase is that the SystemZ backend's
> > copyRegToReg does not support spilling the PSW register. And the backend
isn't
> > configured to allow the scheduler to duplicate the comparison.
> Well... I think it might be better to add instructions to save / restore CC
> part of PSW :)
>

It's not necessary you have just to implement getCrossCopyRegClass in
RegisterInfo like X86 backend does. I will patch if I have time
Quuxplusone commented 14 years ago
(In reply to comment #9)
> It's not necessary you have just to implement getCrossCopyRegClass in
> RegisterInfo like X86 backend does. I will patch if I have time
This surely won't work:
1. You need to save / restore only part of PSW, otherwise you'll change the
program counter.
2. You cannot use normal "lr" instruction to move from/to PSW, this is not a
normal register.

Basically, you need to use ipm instruction to copy CC & program mask part of
PSW to register and "tmh reg, 1228" to insert part of the reg into CC & program
mask fields of PSW
Quuxplusone commented 14 years ago

In fact if you implement getCrossCopyRegClass like in X86 backend, the scheduler will never try to copy PSW. It will try to copy PSW only if getCrossCopyRegClass return null like in base class default implementation.

Quuxplusone commented 14 years ago
(In reply to comment #11)
> In fact if you implement getCrossCopyRegClass like in X86 backend, the
> scheduler will never try to copy PSW. It will try to copy PSW only if
> getCrossCopyRegClass return null like in base class default implementation.
No, this is not true.

if getCrossCopyRegClass() is NULL, then it's possible to copy register directly
to/from any regclass, and direct reg-reg copy is inserted (so, in our case both
destreg and srcreg can be PSW...). If it returns non-null, then two copies are
inserted - one from source register to crosscopy regclass and another one -
from crosscopy regclass to dest reg.

In any case you need to have instructions to move to/from the specified
regclass. The only difference is that in the latter case you need to have one
small amount of reg-reg move instructions to be supported.
Quuxplusone commented 14 years ago
Yes in theory, but it's not implemented like that currently.
Look at ScheduleDAGRRList.cpp:761
Quuxplusone commented 14 years ago
(In reply to comment #13)
> Yes in theory, but it's not implemented like that currently.
> Look at ScheduleDAGRRList.cpp:761
Yeah, implemented exactly as I described. If DestRC is not null, then NewDef is
not null and thus extra copy is issued (block started from line 771).
Quuxplusone commented 14 years ago
Hum sorry, your rigth.
But this happen only when CopyAndMoveSuccessors fail (return null).
I hope this function rarely return null else it is very expensive to copy PSW
as you say :)
Quuxplusone commented 14 years ago

what's the status of this?

Quuxplusone commented 14 years ago
(In reply to comment #16)
> what's the status of this?
Multiple problems here :) But the general one is different code assumptions of
isel & machine verififier. It's not systemz-specific though.
Quuxplusone commented 7 years ago
This seems fixed in current master. I used the following steps to reproduce:

bin/clang -O3 -S -emit-llvm bug-sysZ.c
bin/llc -O3 -mtriple=s390x-linux-gnu -debug -verify-machineinstrs bug-sysZ.ll
Quuxplusone commented 7 years ago

I forgot to add: I did not manage to reproduce the failure with the steps in my previous comment.