Closed by nlewycky 17 years ago
Please move the perf issue to the PPC README. Thanks!
It's still a bug, just a perf one, not a miscompile.
Since it sounds like this is a known design issue, I'm fine with setting this bug to LATER and assuming it'll get fixed in the rewrite. Optionally, this could be added to the PPC backend README just so it doesn't get lost.
The current hazard recognizer is something of a hack. Because it runs before regalloc, it doesn't work with spill code, doesn't handle multiple instrs in an SUnit, etc. The right solution is to move it to a post-regalloc scheduler. Dale may have one going in the next month or three.
Since this isn't the bug behind the miscompile, I suggest closing this and opening a new one when the real cause is determined. Does that seem OK?
-Chris
I'm not sure how it breaks crtbegin.o either. I wrote that comment before I knew that PPC was safe against such hazards. The observed behaviour is that if I link a program with the generated crtbegin.o it crashes in static initialization. If I add a nop between those two instructions in crtbegin.o, the linked program no longer crashes.
I left this bug open because it does appear that there is some bug in the PPC backend not sending all instructions to the hazard recognizer, but I no longer think it's responsible for the miscompile.
For reference, this is the generated code:
f:
        mflr 0
        stw 0, 4(1)
        stwu 1, -16(1)
        mtctr 3
        bctrl
        addi 1, 1, 16
        lwz 0, 4(1)
        mtlr 0
        blr
How does it break crtbegin.o? PPC chips have hardware interlocks, so a missing NOP should just be a performance issue, no?
The testcase to reproduce it is:
define void @f(i32 ()* %func) {
entry:
        %tmp1 = tail call i32 %func( )
        ret void
}
Run "llc -mtriple=powerpc-linux-gnu" and you'll see "mtctr 3" right before the "bctrl" instruction. Placing a NOP between them fixes crtbegin.o.
Extended Description
There is code in the PPCHazardRecognizer to detect mtctr / bctrl hazards. However, this sequence is generated by an indirect call and is put into one SUnit. Only the first member of the SUnit (mtctr) is passed to the hazard recognizer by the DAG scheduler, so it doesn't detect the hazard.
This breaks crtbegin.o on PPC/Linux.
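To illustrate the SUnit problem described above, here is a minimal toy model in C++. It is only a sketch: ToySUnit, ToyHazardRecognizer, and both counting functions are hypothetical names made up for this example and are not LLVM's real classes or APIs. It assumes a recognizer that flags a bctrl issued immediately after an mtctr, and compares feeding it only the first instruction of each unit with feeding it every instruction.

```cpp
#include <string>
#include <vector>

// Hypothetical toy type, NOT LLVM's SUnit: a group of instructions
// that the scheduler treats as a single scheduling unit.
struct ToySUnit {
    std::vector<std::string> instrs;
};

// Hypothetical toy recognizer, NOT LLVM's PPCHazardRecognizer.
// It flags a bctrl issued immediately after an mtctr.
class ToyHazardRecognizer {
    std::string last_;  // opcode of the previously emitted instruction
public:
    // Records `op` and returns true if it forms a hazard with the previous op.
    bool emit(const std::string &op) {
        bool hazard = (last_ == "mtctr" && op == "bctrl");
        last_ = op;
        return hazard;
    }
};

// Models the reported behaviour: only the first member of each unit
// is passed to the recognizer.
int countHazardsFirstOnly(const std::vector<ToySUnit> &units) {
    ToyHazardRecognizer hr;
    int hazards = 0;
    for (const ToySUnit &u : units)
        if (!u.instrs.empty() && hr.emit(u.instrs.front()))
            ++hazards;
    return hazards;
}

// Models the alternative: every instruction in every unit is passed
// to the recognizer.
int countHazardsAllInstrs(const std::vector<ToySUnit> &units) {
    ToyHazardRecognizer hr;
    int hazards = 0;
    for (const ToySUnit &u : units)
        for (const std::string &op : u.instrs)
            if (hr.emit(op))
                ++hazards;
    return hazards;
}
```

With the units {mflr}, {stw}, {stwu}, {mtctr, bctrl}, {addi}, countHazardsFirstOnly returns 0 because the recognizer never sees the bctrl, while countHazardsAllInstrs returns 1.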