Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

PPC: condition register bits and registers confused #1847

Closed Quuxplusone closed 9 years ago

Quuxplusone commented 17 years ago
Bugzilla Link PR1765
Status RESOLVED FIXED
Importance P normal
Reported by Nick Lewycky (nicholas@mxc.ca)
Reported on 2007-11-04 01:17:41 -0800
Last modified on 2014-12-09 08:07:56 -0800
Version trunk
Hardware PC Linux
CC evan.cheng@apple.com, fang@csl.cornell.edu, hfinkel@anl.gov, llvm-bugs@lists.llvm.org, natebegeman@mac.com, nicolas.geoffray@gmail.com
Fixed by commit(s)
Attachments pr1765-1.patch (2881 bytes, text/plain)
creqv.patch (15385 bytes, text/plain)
creqv.patch (17319 bytes, text/plain)
creqv.patch (14980 bytes, text/plain)
Blocks
Blocked by
See also
The PPC backend currently emits code that doesn't distinguish between CR6 and
bit 6 (being a bit in CR1). This breaks PPC/Linux where "creqv 6, 6, 6" is
thought by the backend to be clobbering CR6 instead of CR1EQ.

Often the problem manifests in code like:
  creqv 2, 2, 2
  mr 6, 2
  bl somevararg1
  mr 6, 2
  bl somevararg2

and this crops up in 2005-03-07-VarArgs.c.

I'm not sure what the correct fix looks like. I tried modelling each CR-bit as
an i1 subreg of the CR and made a total mess of things. (I can attach the patch
if anyone wants it.)
Quuxplusone commented 17 years ago

I meant "mcrf" in the assembly, where I wrote "mr". Sorry for the confusion.

Quuxplusone commented 16 years ago

So the creqv 6,6,6 instruction has to be emitted at each variadic function call. I don't think we can use BuildMI when lowering a call into DAGs, right?.

The best solution I see is to create new calls instruction: {BCTRL,BL,BLA}_ELF_VARIADIC (we already have BCTRL_ELF and BCTRL_MachO). We then change PPCCodeEmitter.emitInstruction to emit besides the call the "creqv 6,6,6" instruction when emitting these new instructions.

Evan, since you helped me with this already, does that make sense?

Quuxplusone commented 16 years ago
That's not the problem. There's already code that builds the DAG for "creqv
6,6,6" before each function call.

The problem is that LLVM doesn't see the difference between *bit* 6 and
*register* 6 in the condition register.

Look at the definitions in PPCInstrInfo.td:

  def MCRF   : XLForm_3<19, 0, (outs CRRC:$BF), (ins CRRC:$BFA),
                        "mcrf $BF, $BFA", BrMCR>,
               PPC970_DGroup_First, PPC970_Unit_CRU;

  def CREQV  : XLForm_1<19, 289, (outs CRRC:$CRD), (ins CRRC:$CRA, CRRC:$CRB),
                        "creqv $CRD, $CRA, $CRB", BrCR,
                        []>;

  def SETCR  : XLForm_1_ext<19, 289, (outs CRRC:$dst), (ins),
                "creqv $dst, $dst,
                []>;

CRRC is defined as the 8 registers, CR0 to CR7. According to the above
definition, CREQV takes one of those registers.

It doesn't. The instruction definition is wrong.

In PPC asm, "creqv" takes a number from 0 to 31, being which bit inside of CR
is being hit (0..3 being CR0, 4..7 being CR1, etc). That's what isn't being
modelled here, the fact that CREQV is supposed to be operating on BITS while
MCRF is operating on 4-bit subregs.

Does that make sense?
Quuxplusone commented 16 years ago

Yes Nick, we agree on the problem.

What I don't see is how to model the solution in LLVM (but hey, I'm not the expert). I did a workaround when Jitting by always generating a "creqv 6,6,6" when generating calls. But that's ugly.

The simpliest solution I see is to do a BuildMI somewhere. But I don't know where.

Quuxplusone commented 16 years ago
Thans for the patch, this helped a lot. I made a new patch based on yours,
which seems to work on simple cases. I'll post it when it can be compiled with
TOT.

It's a shame though that the CopyCost flag is not taken into account in
copyToReg. With my patch, the generated code saves the CR register (the 32
bits) when starting the function and restores it at the end because the SETCR
instruction emits:
creqv 8,8,8

and the copyToReg emits:
cror 6,8,8

We should always emit:
creqv 6,6,6
Quuxplusone commented 16 years ago

Right, looks like you need to 1)model CRBIT registers, e.g. CR0LT, etc. as sub-registers of CR[0-7], 2) add a CRBIT register class, and 3) change CREQV (and others?) to operate on CRBITRC instead of CRRC.

Quuxplusone commented 16 years ago

Attached pr1765-1.patch (2881 bytes, text/plain): it's a start (makes things worse on its own)

Quuxplusone commented 16 years ago

Evan, that's exactly what my patch did. But it still doesn't work because LLVM thinks it can copy these registers (despite let CopyCost = -1), compare their values, etc. Any ideas what's going on?

Nicolas, I tried to un-bitrot your patch (LLVM sure moves quickly these days, huh?) but all I get is this error:

Cannot yet select: 0x8b92d00: i32 = zero_extend 0x8b8aaf0

when running it over varargs.bc. It also asserts on a function with parameter an i1 parameter.

Quuxplusone commented 16 years ago

Attached creqv.patch (15385 bytes, text/plain): Improve Nick's patch

Quuxplusone commented 16 years ago

Attached creqv.patch (17319 bytes, text/plain): Resync with TOT

Quuxplusone commented 16 years ago

Attached creqv.patch (14980 bytes, text/plain): Re-resync with TOT

Quuxplusone commented 16 years ago
+def SETCR  : XLForm_1_ext<19, 289, (outs CRBITRC:$dst), (ins),

Could you rename it CRSET? I found that as a mnemonic in binutils and elsewhere.
(eg. http://ecos.sourceware.org/ml/binutils/2001-02/msg00014.html )

--- PPCISelLowering.cpp (revision 48053)
+++ PPCISelLowering.cpp (working copy)
@@ -205,6 +205,7 @@
   setOperationAction(ISD::STACKRESTORE      , MVT::Other, Custom);
   setOperationAction(ISD::DYNAMIC_STACKALLOC, MVT::i32  , Custom);
   setOperationAction(ISD::DYNAMIC_STACKALLOC, MVT::i64  , Custom);
+

Remove mystery additional blank line

+  } else if (RC == PPC::CRBITRCRegisterClass) {
+    if (SrcReg >= PPC::CR0LT || SrcReg <= PPC::CR0UN)
+      return StoreRegToStackSlot(TII, PPC::CR0, isKill, FrameIdx,
+                                 PPC::CRRCRegisterClass, NewMIs);
+    else if (SrcReg >= PPC::CR1LT || SrcReg <= PPC::CR1UN)
+      return StoreRegToStackSlot(TII, PPC::CR1, isKill, FrameIdx,
+                                 PPC::CRRCRegisterClass, NewMIs);

If someone tries to load/store the CR0EQ register, it'll clobber all of CR0? Is
that going to cause any problems?
Quuxplusone commented 16 years ago
> If someone tries to load/store the CR0EQ register, it'll clobber all of CR0?
Is
> that going to cause any problems?

No, it shouldn't cause any problem, it will be saved in prologue and restored
in epilogue.
Quuxplusone commented 16 years ago

That deserves a FIXME. The code isn't actually doing what it says it is (saving the individual CR bits) but it just happens to work due to facts outside of the scope of this method.

Quuxplusone commented 16 years ago

By the way, what makes you so sure that's true? The backend knows that CMP instructions clobber CR-regs. Couldn't it try to spill the bit out of the CR-reg, clobber the reg for the CMP, then restore the bit later? It could spill CR1EQ and that's fine because the only comparison result it cared about was CR1LT or something?

Quuxplusone commented 16 years ago
(In reply to comment #15)
> By the way, what makes you so sure that's true? The backend knows that CMP
> instructions clobber CR-regs. Couldn't it try to spill the bit out of the
> CR-reg, clobber the reg for the CMP, then restore the bit later? It could
spill
> CR1EQ and that's fine because the only comparison result it cared about was
> CR1LT or something?
>

OK, I see what you're saying. I added a FIXME in the code.
Quuxplusone commented 16 years ago

Is this fixed?

Quuxplusone commented 16 years ago

Sort of. Evan and Nicholas pointed out new problems with the patch. The patch is still definitely an improvement for this PR.

It should be closed, but as states Evan, a new PR must be open. I'd rather let Nicholas or Evan open it since I'm not sure I have all elements to make a comprehensive description of the new problems.

Quuxplusone commented 12 years ago

Hi, is this still a fix in-progress?

Quuxplusone commented 12 years ago

Hi David, no progress on my front. I don't know if a proper fix has been applied.

Quuxplusone commented 9 years ago
The PPC backend now properly tracks CR bits, and the current creqv definition
reads:

def CREQV  : XLForm_1<19, 289, (outs crbitrc:$CRD),
                               (ins crbitrc:$CRA, crbitrc:$CRB),
                      "creqv $CRD, $CRA, $CRB", IIC_BrCR,
                      [(set i1:$CRD, (not (xor i1:$CRA, i1:$CRB)))]>;

I assume this bug has now been fixed.