classilla / tenfourfox

Mozilla for Power Macintosh.
http://www.tenfourfox.com/

IonMonkey PPC backend #178

Closed GoogleCodeExporter closed 5 years ago

GoogleCodeExporter commented 9 years ago
from dvander to m.dev.tech.js-engine today,

IonMonkey has now landed on mozilla-central (yay!). Largely this shouldn't 
affect anyone doing SpiderMonkey development, but in case it does, here are the 
big takeaway changes:

(0) Benchmarks (usually) get faster. Compiling the shell does not. Sorry :(

(1) When running the shell, the flags "--ion -m -n" are now implied by default. 
You can disable them respectively with "--no-ion", "--no-jm", and "--no-ti". 
Disabling TI disables IonMonkey.

In the browser, there is one new JS pref: "javascript.options.ion.content". We 
don't expose any other flags since they'd only exist to horribly break stuff.

(2) IonMonkey, unlike JM, does not use the interpreter stack to store local 
variables and frames. It uses the C stack. This means that cx->fp(), 
js_GetTopStackFrame(), etc, must not be used except with great care. Even if 
you have a js::StackFrame, it is not okay to peek at it because it could be 
stale.

When in doubt, use the wonderful ScriptFrameIter class. It has abstractions for 
walking the stack and inspecting frames so you don't ever have to touch a 
js::StackFrame.

(3) Lastly, IonMonkey introduces new ways to get in and out of the JIT. 
Briefly, they are:

  (a) At function calls or loop edges, we may decide to run a script with IonMonkey. From C++, this goes through ion::Cannon.

  (b) A guard failure, type-inference invalidation, or GC can cause a "bailout". A bailout is when an Ion frame on the stack must be converted back into an interpreter frame. When this happens, interpreter frames are created for each JS frame in the Ion frame (there can be multiple because of inlining), and we resume running the function in the interpreter instead.

-David 

The stack frame stuff could be ... very interesting given our past 
misadventures with stack. For Judgment Day, we will simply disable it, but we 
should start working on it.

Original issue reported on code.google.com by classi...@floodgap.com on 11 Sep 2012 at 8:32

GoogleCodeExporter commented 9 years ago
http://hg.mozilla.org/mozilla-central/file/fdfaef738a00/js/src/configure.in 
implies that sparc has ENABLE_ION=0, so that should work for us also. We should 
wait to see if there is a SPARC version because we can pattern ourselves upon 
that.

Original comment by classi...@floodgap.com on 11 Sep 2012 at 8:39

GoogleCodeExporter commented 9 years ago
Full JM ABI compliance will be needed, and probably is needed due to some 
glitchiness in 17 anyway.

Original comment by classi...@floodgap.com on 18 Sep 2012 at 2:16

GoogleCodeExporter commented 9 years ago

Original comment by Tobias.N...@gmail.com on 19 Dec 2012 at 10:35

GoogleCodeExporter commented 9 years ago
Issue 150 has been merged into this issue.

Original comment by Tobias.N...@gmail.com on 19 Dec 2012 at 10:37

GoogleCodeExporter commented 9 years ago

Original comment by Tobias.N...@gmail.com on 19 Dec 2012 at 11:15

GoogleCodeExporter commented 9 years ago
Started work on this today. We will be the first big-endian port, it looks 
like. Converted about half the files for about 10% of the code base. I am 
patterning us after ARM.

Things I've noticed already: Ion allows us to get all 32 GPRs *and* FPRs in 
play. Jumps use a jump table.

Original comment by classi...@floodgap.com on 14 Jan 2013 at 5:11

GoogleCodeExporter commented 9 years ago
A lot of the LIR/MIR stuff is totally bonkers. I don't have enough experience 
with intermediate representations to make much sense of it, but then neither 
did the author of the ARM port (actual comment: "oh god, what is this code?").

Original comment by classi...@floodgap.com on 15 Jan 2013 at 3:48

GoogleCodeExporter commented 9 years ago
IonFrames-ppc.h tonight. I suspect that the stack is not ABI compliant 
while JIT code is running until it has to make a call, but I haven't finished 
comparing this against the ARM stack during a JaegerMonkey run.

Original comment by classi...@floodgap.com on 18 Jan 2013 at 3:15

GoogleCodeExporter commented 9 years ago
After a couple days of staring at it, I think I get IonFrames. It's taking 
IonCommonFrameLayout, which should have our linkage area and arguments (wink to 
Ben), then adding a frame descriptor. All the other stack frame layouts are 
overlaid on it and add additional data fields to the stack.

What I still need to figure out is how registers get spilled (right now they 
seem to spill to a "spill area" in the stack, but this doesn't seem ABI 
compliant) and how control is transferred.

Original comment by classi...@floodgap.com on 23 Jan 2013 at 4:33

GoogleCodeExporter commented 9 years ago
dvander replied to my query.

There is a common frame header, which consists of a return address and a
caller-pushed fake-o frame pointer (contains outgoing type and
framesize). After that each of the frames is quite different.

  OptimizedJS - pushed when calling into a JavaScript function
  Entry - pushed as part of EnterIon(), just acts as a delimiter
  Rectifier - pushed in between two OptimizedJS frames when the argument
              count doesn't match
  Bailed_JS - an OptimizedJS frame that has been bailed out
  Bailed_Rectifier - Rectifier frame that has been bailed out
      (these two are never pushed, the frametype bits are just modified.
       they can only exist as the most recent frame.)
  Exit - pushed when an OptimizedJS frame wants to call into C++
  Osr - unused, you can ignore it

Sample frame stacks might look like this from oldest to newest:
   [Entry, OptimizedJS, OptimizedJS, OptimizedJS, Exit]
   [Entry, OptimizedJS, Rectifier, OptimizedJS, Rectifier, OptimizedJS]
   [Entry, OptimizedJS, Bailed_JS]

Unfortunately I don't think the state is documented well, and it's got
lots of corners. It's also scattered across Ion*FrameLayout,
Bailouts.cpp, and IonFrames.cpp. The good news is that these days, the
majority of the frame code is identical across x86/x64/ARM, so PPC might
just fit in too.
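
The frame types and sample stacks above can be modelled with a small C++ sketch. This is only an illustration, not code from the Ion sources: `FrameKind`, `isJSFrame` and `plausibleStack` are hypothetical names, and the rules checked are just the ones stated in the reply (an activation starts with Entry, a Rectifier sits between two OptimizedJS frames, and the Bailed_* variants can only be the most recent frame).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of the frame kinds listed above (not the real Ion enum).
enum class FrameKind { Entry, OptimizedJS, Rectifier, BailedJS, BailedRectifier, Exit, Osr };

// A bailed-out JS frame is still an OptimizedJS frame with its type bits changed.
static bool isJSFrame(FrameKind k) {
    return k == FrameKind::OptimizedJS || k == FrameKind::BailedJS;
}

// Checks a stack listed from oldest to newest against the stated rules.
bool plausibleStack(const std::vector<FrameKind>& frames) {
    if (frames.empty() || frames.front() != FrameKind::Entry)
        return false;                       // an activation starts with Entry
    for (std::size_t i = 0; i < frames.size(); ++i) {
        const FrameKind k = frames[i];
        const bool newest = (i + 1 == frames.size());
        if ((k == FrameKind::BailedJS || k == FrameKind::BailedRectifier) && !newest)
            return false;                   // Bailed_* only as the most recent frame
        if (k == FrameKind::Rectifier) {
            if (i == 0 || newest)
                return false;               // must have a neighbour on both sides
            if (frames[i - 1] != FrameKind::OptimizedJS || !isJSFrame(frames[i + 1]))
                return false;               // sits between two JS frames
        }
    }
    return true;
}
```

All three sample stacks from the reply pass this check, while something like [Entry, Bailed_JS, OptimizedJS] does not.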

Original comment by classi...@floodgap.com on 31 Jan 2013 at 3:42

GoogleCodeExporter commented 9 years ago
and npierron,

Hi Cameron,

On 01/30/2013 01:38 PM, David Anderson wrote:
 > The baseline compiler folks have been deep in this area lately too.
 >
 > There is a common frame header, which consists of a return address and a
 > caller-pushed fake-o frame pointer (contains outgoing type and
 > framesize). After that each of the frames is quite different.

This shared stack representation is used to store the return address and a 
descriptor.  All calls into/from Ion should register these fields, except for a 
few callWithABI calls which are side-effect free (GC-free).

The return address, in addition to "frequently" (see fake-exit frames 
detailed below) containing the back link, is used for indexing safepoints and 
snapshots, which are respectively used for marking objects which are on the 
stack (see saveLive functions) and for recovering slots of inspected frames, 
which is necessary for fun.arguments and for iterating inline frames.

The descriptor is a packed value which is computed statically except for 
fun.apply calls where we copy the arguments (see visitApplyArgsGeneric) and 
for the rectifier frame (see generateArgumentsRectifier).  It contains the 
type of the opened frame, and the number of bytes between the bottom of the 
opened frame and the top of the parent frame (see IonFrameIterator::prevFp).
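
A packed descriptor of this kind can be sketched as follows. The field layout is an assumption made for illustration (frame type in the low 4 bits, byte count in the remaining bits), and `makeDescriptor`, `descriptorType` and `descriptorSize` are hypothetical helper names, not the Ion API.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout: low kFrameTypeBits bits hold the frame type, the rest the size.
constexpr unsigned kFrameTypeBits = 4;
constexpr std::uintptr_t kFrameTypeMask = (std::uintptr_t(1) << kFrameTypeBits) - 1;

// Packs a frame size in bytes and a frame type into one machine word.
inline std::uintptr_t makeDescriptor(std::uintptr_t frameSizeBytes, unsigned type) {
    assert(type <= kFrameTypeMask);  // the type must fit in its field
    return (frameSizeBytes << kFrameTypeBits) | type;
}

// Recovers the frame type from a descriptor.
inline unsigned descriptorType(std::uintptr_t descriptor) {
    return static_cast<unsigned>(descriptor & kFrameTypeMask);
}

// Recovers the byte count from a descriptor.
inline std::uintptr_t descriptorSize(std::uintptr_t descriptor) {
    return descriptor >> kFrameTypeBits;
}
```

With this layout, a stack walker would read the descriptor, take the size to step to the parent frame, and take the type to decide how to interpret it.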

We have different kinds of frames:

 >    Entry - pushed as part of EnterIon(), just acts as a delimiter

The entry frame is pushed by one of the Trampoline functions (see 
generateEnterJIT), and it marks the beginning of an Ion activation, and the 
end of the stack iteration for IonFrameIterator.

This frame is a JS Frame, as it needs to push all actual arguments (the ones 
effectively given by the caller) and the strict minimum expected by the callee 
(the formal arguments).  Then after pushing all arguments, we push the 
number of actual arguments (might be less than the number of copied 
arguments, in case of underflow), followed by the CalleeToken.  The 
CalleeToken is either a JSScript pointer (if the low bit is 1), or a 
JSFunction pointer (if the low bit is 0).  Then come the usual descriptor 
& return address.
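
The low-bit tagging of the CalleeToken can be sketched like this. `DemoScript` and `DemoFunction` are stand-in types and the helper names are hypothetical; the only thing taken from the text is the convention that a set low bit means "script" and a clear low bit means "function". It relies on the pointers being at least 2-byte aligned.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for JSScript and JSFunction; only their alignment matters here.
struct DemoScript { int id; };
struct DemoFunction { int id; };

using CalleeToken = std::uintptr_t;

// A function pointer is stored as-is (low bit 0).
inline CalleeToken tokenFromFunction(DemoFunction* fun) {
    const std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(fun);
    assert((bits & 1) == 0);  // alignment leaves the low bit free
    return bits;
}

// A script pointer is stored with the low bit set.
inline CalleeToken tokenFromScript(DemoScript* script) {
    const std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(script);
    assert((bits & 1) == 0);
    return bits | 1;
}

inline bool tokenIsScript(CalleeToken token) { return (token & 1) != 0; }

// Strips the tag bit before converting back to a pointer.
inline DemoScript* tokenToScript(CalleeToken token) {
    assert(tokenIsScript(token));
    return reinterpret_cast<DemoScript*>(token & ~std::uintptr_t(1));
}

inline DemoFunction* tokenToFunction(CalleeToken token) {
    assert(!tokenIsScript(token));
    return reinterpret_cast<DemoFunction*>(token);
}
```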

 >    OptimizedJS - pushed when calling into a JavaScript function

This kind of frame is pushed for Ion-compiled functions.  The call is 
composed of actual arguments, the number of actual arguments, the callee 
token, the descriptor and the return address.

 >    Rectifier - pushed in between two OptimizedJS frames when the argument
 >                count doesn't match

If during the call we detect that the JSFunction expects more arguments 
(underflow of arguments), then we call into generateArgumentsRectifier 
(setting the descriptor accordingly).  The rectifier pushes the missing 
formal arguments (set to UndefinedValue), followed by the actual arguments, 
followed by the number of actual arguments, followed by the same calleeToken, a 
descriptor describing the rectifier frame size, and the return address into 
the rectifier.
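
The padding step of the rectifier can be sketched in isolation. This models only the argument list, not the real stack layout or push order; `Value`, `RectifiedArgs` and `rectify` are hypothetical names, and an undefined `Value` stands in for UndefinedValue.

```cpp
#include <cstddef>
#include <vector>

// Toy value: 'defined == false' plays the role of UndefinedValue.
struct Value {
    bool defined;
    int payload;
};

struct RectifiedArgs {
    std::vector<Value> args;   // actual arguments, padded up to the formal count
    std::size_t numActual;     // what the caller really passed
};

// Pads the caller's arguments with undefined values when the callee expects more.
RectifiedArgs rectify(const std::vector<int>& actuals, std::size_t numFormals) {
    RectifiedArgs out;
    out.numActual = actuals.size();
    for (int v : actuals)
        out.args.push_back(Value{true, v});
    while (out.args.size() < numFormals)
        out.args.push_back(Value{false, 0});  // missing formal -> undefined
    return out;
}
```

The callee then always sees at least `numFormals` arguments, while `numActual` preserves the original count for things like fun.arguments.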

 >    Exit - pushed when an OptimizedJS frame wants to call into C++

An exit frame is used to mark the point where we exit an Ion activation.  The 
minimum required is a descriptor and a return address.  Then depending on 
the kind of the exit frame we may expect different kinds of arguments.  To 
identify the variety of exit frames, we use a footer which is used when 
marking arguments (see MarkIonExitFrame).

All exit frames have to set the thread-local ionTop of the runtime.  This 
variable is used to start iterating the stack when StackIter or the GC 
needs it.  This is abstracted under linkExitFrame.

We can distinguish 2 categories of exit frames, normal exit frames (callVM, 
see generateVMWrapper) and fake exit frames.  Normal exit frames use a 
call to a VM wrapper to fill the footer and to link the exit frame.  Fake 
exit frames are inlined into the generated code (for DOM calls, and for 
functions called from Inline Caches) and they use a fake return 
address which only serves as an index to find the safepoints and snapshots.

 >    Bailed_JS - an OptimizedJS frame that has been bailed out
 >    Bailed_Rectifier - Rectifier frame that has been bailed out
 >        (these two are never pushed, the frametype bits are just modified.
 >         they can only exist as the most recent frame.)

In case of bailout, we have to unwind the last frame.  The last frame then 
appears as an exit frame; as the size of an exit frame is different from the 
size of an OptimizedJS frame / Rectifier frame, we set this Bailed_* 
variant to avoid resizing the frame in-place (see EnsureExitFrame).

 >    Osr - unused, you can ignore it
 >
 > Sample frame stacks might look like this from oldest to newest:
 >     [Entry, OptimizedJS, OptimizedJS, OptimizedJS, Exit]
 >     [Entry, OptimizedJS, Rectifier, OptimizedJS, Rectifier, OptimizedJS]
 >     [Entry, OptimizedJS, Bailed_JS]
 >
 > Unfortunately I don't think the state is documented well, and it's got
 > lots of corners. It's also scattered across Ion*FrameLayout,
 > Bailouts.cpp, and IonFrames.cpp. The good news is that these days, the
 > majority of the frame code is identical across x86/x64/ARM, so PPC might
 > just fit in too.

I guess you can follow what Marty did for ARM, as you are also sharing a 
link register.  Still this might be a bit more tricky if you cannot do the 
same trick as ARM for pushing the pc ahead of the call, in which case you 
might have to store it after branching to the generate VM wrapper.
[...]
Don't hesitate to come back to us if you have any trouble ;)
I will update the documentation tomorrow with what I mentioned in this email.

https://wiki.mozilla.org/IonMonkey/Frames

Original comment by classi...@floodgap.com on 31 Jan 2013 at 3:45

GoogleCodeExporter commented 9 years ago
https://bugzilla.mozilla.org/show_bug.cgi?id=764876 shows our timeframe is 
getting awful short.

Original comment by classi...@floodgap.com on 21 Mar 2013 at 12:35

GoogleCodeExporter commented 9 years ago

Original comment by classi...@floodgap.com on 13 Apr 2013 at 12:18

GoogleCodeExporter commented 9 years ago
Assembler and macro assembler are complete enough to try to get this to compile.

Original comment by classi...@floodgap.com on 25 Apr 2013 at 5:10

GoogleCodeExporter commented 9 years ago
Assembler parses. Macro assembler parses, too, but seems to be missing a lot.

Original comment by classi...@floodgap.com on 29 Apr 2013 at 4:55

GoogleCodeExporter commented 9 years ago
MacroAssembler now compiles. Lots of unhappy build warnings I need to sort out 
before I can call stage 2 completed.

Original comment by classi...@floodgap.com on 30 Apr 2013 at 2:14

GoogleCodeExporter commented 9 years ago
Scary:

/Volumes/BruceDeuce/src/mozilla-21b/js/src/methodjit/PolyIC.h:403:27: warning: 
'js::mjit::ic::PICInfo::shapeReg' is too small to hold all values of 
'js::mjit::MacroAssemblerTypedefs::RegisterID {aka enum 
JSC::PPCRegisters::RegisterID}' [enabled by default]
/Volumes/BruceDeuce/src/mozilla-21b/js/src/methodjit/PolyIC.h:404:27: warning: 
'js::mjit::ic::PICInfo::objReg' is too small to hold all values of 
'js::mjit::MacroAssemblerTypedefs::RegisterID {aka enum 
JSC::PPCRegisters::RegisterID}' [enabled by default]

Original comment by classi...@floodgap.com on 30 Apr 2013 at 2:25

GoogleCodeExporter commented 9 years ago
The only way to deal with those warnings may be to not use a few registers,
unfortunately.

From methodjit/PolyIC.h:

struct PICInfo {
[...]
  RegisterID shapeReg : 5;        // also the out type reg
  RegisterID objReg   : 5;        // also the out data reg
};

as well as typeReg inside the union. Apparently it Has Been Decided that 32
registers are Enough For Anyone. The bit-packing looks to have been
carefully designed, so at least for now I would recommend just abandoning a
few GPRs. But these compiler warnings certainly can't be ignored, since
they will lead to horrible mis-JITing if the bitfields overflow.
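
To make the overflow concrete, here is a minimal sketch of what a 5-bit field does to a register number that does not fit. `PackedRegs` is a hypothetical stand-in using plain unsigned fields rather than the real RegisterID enum, so it only illustrates the truncation, not the exact warning.

```cpp
// Stand-in for the two 5-bit register fields in PICInfo.
struct PackedRegs {
    unsigned shapeReg : 5;
    unsigned objReg   : 5;
};

// Stores a register number in the 5-bit field and reads it back.
inline unsigned storeShapeReg(unsigned regNumber) {
    PackedRegs p{};
    p.shapeReg = regNumber;  // only the low 5 bits survive
    return p.shapeReg;
}

// A register number is safe to pack only if it is in 0..31.
inline bool fitsInFiveBits(unsigned regNumber) {
    return regNumber < 32;
}
```

Register 31 round-trips unchanged, but 32 reads back as 0 and 33 as 1, so the JIT would silently emit code for the wrong register.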

FYI: I no longer have a functioning PPC machine, since my old G4 Powerbook
finally gave up last fall. I'm still keeping an eye on 10.4Fx, but I can't
really contribute code anymore since I don't have a machine to compile or
test with.

Original comment by magef...@gmail.com on 7 May 2013 at 3:11

GoogleCodeExporter commented 9 years ago
No worries. I appreciate your hard work on JM! It will still carry over into 
the new assembler.

In any case, I merely mention the warnings for amusement, since there will be 
no JM integration with Ion (I'm shooting for the Fx25 timeframe, and by then 
the baseline compiler will be well established).

Still working on the Lowering and MoveEmitter portions. MoveEmitter is going to 
need a lot of overhauling, since I wrote it originally with assumptions that no 
longer hold true for the current version of the MacroAssembler. I can see why 
there's no MIPS or SPARC port; this is really involved.

Original comment by classi...@floodgap.com on 7 May 2013 at 4:08

GoogleCodeExporter commented 9 years ago
JS now builds, but does not link yet. We still need

  "js::ion::CodeGeneratorPPC::visitDivI(js::ion::LDivI*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitCompareD(js::ion::LCompareD*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitUnbox(js::ion::LUnbox*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitMinMaxD(js::ion::LMinMaxD*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitStoreSlotT(js::ion::LStoreSlotT*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::ToValue(js::ion::LInstruction*, unsigned long)", referenced from:
[...]
  "js::ion::CodeGeneratorPPC::visitAbsD(js::ion::LAbsD*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitTruncateDToInt32(js::ion::LTruncateDToInt32*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitTestIAndBranch(js::ion::LTestIAndBranch*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitSqrtD(js::ion::LSqrtD*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitOsrValue(js::ion::LOsrValue*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitShiftI(js::ion::LShiftI*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::Assembler::TraceJumpRelocations(JSTracer*, js::ion::IonCode*, js::ion::CompactBufferReader&)", referenced from:
      js::ion::IonCode::trace(JSTracer*)   in libjs_static.a(Ion.o)
  "js::ion::CodeGeneratorPPC::visitTestDAndBranch(js::ion::LTestDAndBranch*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitSubI(js::ion::LSubI*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitModPowTwoI(js::ion::LModPowTwoI*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitBitNotI(js::ion::LBitNotI*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitAddI(js::ion::LAddI*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::ToOutValue(js::ion::LInstruction*)", referenced from:
[...]
  "js::ion::CodeGeneratorPPC::visitNotD(js::ion::LNotD*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitModMaskI(js::ion::LModMaskI*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::FrameSizeClass::FromDepth(unsigned int)", referenced from:
      js::ion::CodeGenerator::link()    in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitRound(js::ion::LRound*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::Assembler::trace(JSTracer*)", referenced from:
      JS::AutoGCRooter::trace(JSTracer*)     in libjs_static.a(RootMarking.o)
  "vtable for js::ion::CodeGeneratorPPC", referenced from:
      __data@0 in libjs_static.a(Ion.o)
  "js::ion::CodeGeneratorPPC::visitLoadElementT(js::ion::LLoadElementT*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitUrshD(js::ion::LUrshD*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitImplicitThis(js::ion::LImplicitThis*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::generateEpilogue()", referenced from:
      js::ion::CodeGenerator::generate()    in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitNotI(js::ion::LNotI*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::bailout(js::ion::LSnapshot*)", referenced from:
      js::ion::CodeGenerator::visitBoundsCheck(js::ion::LBoundsCheck*)  in libjs_static.a(CodeGenerator.o)
      js::ion::CodeGenerator::visitClampVToUint8(js::ion::LClampVToUint8*)  in libjs_static.a(CodeGenerator.o)
  "js::ion::Assembler::copyPreBarrierTable(unsigned char*)", referenced from:
      js::ion::IonCode::copyFrom(js::ion::MacroAssembler&) in libjs_static.a(Ion.o)
  "js::ion::CodeGeneratorPPC::visitGuardShape(js::ion::LGuardShape*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitBoxDouble(js::ion::LBoxDouble*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::AutoFlushCache::flushAnyway()", referenced from:  
[...]
  "js::ion::CodeGeneratorPPC::emitBranch(js::ion::Assembler::Condition, js::ion::MBasicBlock*, js::ion::MBasicBlock*)", referenced from:
  "js::ion::CodeGeneratorPPC::emitTableSwitchDispatch(js::ion::MTableSwitch*, js::ion::Register const&, js::ion::Register const&)", referenced from:
  "js::ion::CodeGeneratorPPC::visitMathD(js::ion::LMathD*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::FrameSizeClass::frameSize() const", referenced from:
[...]
  "js::ion::CodeGeneratorPPC::visitValue(js::ion::LValue*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitCompare(js::ion::LCompare*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::AutoFlushCache::~AutoFlushCache()", referenced from:
[...]
  "js::ion::CodeGeneratorPPC::visitGuardClass(js::ion::LGuardClass*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitInterruptCheck(js::ion::LInterruptCheck*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitLoadSlotV(js::ion::LLoadSlotV*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::Assembler::processCodeLabels(js::ion::IonCode*)", referenced from:
      js::ion::IonCode::copyFrom(js::ion::MacroAssembler&) in libjs_static.a(Ion.o)
  "js::ion::Assembler::copyDataRelocationTable(unsigned char*)", referenced from:
      js::ion::IonCode::copyFrom(js::ion::MacroAssembler&) in libjs_static.a(Ion.o)
  "js::ion::CodeGeneratorPPC::visitBitOpI(js::ion::LBitOpI*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitRecompileCheck(js::ion::LRecompileCheck*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::bailoutIf(js::ion::Assembler::Condition, js::ion::LSnapshot*)", referenced from:
[...]
  "js::ion::CodeGeneratorPPC::visitCompareAndBranch(js::ion::LCompareAndBranch*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::PatchJump(js::ion::CodeLocationJump, js::ion::CodeLocationLabel)", referenced from:
[...]
  "js::ion::CodeGeneratorPPC::bailoutFrom(js::ion::Label*, js::ion::LSnapshot*)", referenced from:
[...]
  "js::ion::CodeGeneratorPPC::generateInvalidateEpilogue()", referenced from:
      js::ion::CodeGenerator::generate()    in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitCompareDAndBranch(js::ion::LCompareDAndBranch*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::generatePrologue()", referenced from:
      js::ion::CodeGenerator::generate()    in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::storeElementTyped(js::ion::LAllocation const*, js::ion::MIRType, js::ion::MIRType, js::ion::Register const&, js::ion::LAllocation const*)", referenced from:
[...]
  "js::ion::CodeGeneratorPPC::visitMulI(js::ion::LMulI*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::FrameSizeClass::ClassLimit()", referenced from:
      js::ion::IonRuntime::initialize(JSContext*)     in libjs_static.a(Ion.o)
      js::ion::IonRuntime::initialize(JSContext*)     in libjs_static.a(Ion.o)
  "js::ion::CodeGeneratorPPC::visitModI(js::ion::LModI*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::Assembler::TraceDataRelocations(JSTracer*, js::ion::IonCode*, js::ion::CompactBufferReader&)", referenced from:
      js::ion::IonCode::trace(JSTracer*)   in libjs_static.a(Ion.o)
  "js::ion::CodeGeneratorPPC::visitDouble(js::ion::LDouble*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitBox(js::ion::LBox*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitLoadSlotT(js::ion::LLoadSlotT*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitCompareB(js::ion::LCompareB*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitMoveGroup(js::ion::LMoveGroup*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::CodeGeneratorPPC(js::ion::MIRGenerator*, js::ion::LIRGraph*)", referenced from:
      js::ion::CodeGenerator::CodeGenerator(js::ion::MIRGenerator*, js::ion::LIRGraph*) in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::generateOutOfLineCode()", referenced from:
      js::ion::CodeGenerator::generate()    in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitFloor(js::ion::LFloor*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitCompareBAndBranch(js::ion::LCompareBAndBranch*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)
  "js::ion::CodeGeneratorPPC::visitPowHalfD(js::ion::LPowHalfD*)", referenced from:
      vtable for js::ion::CodeGenerator in libjs_static.a(CodeGenerator.o)

I'll be lucky to have this done by next year. :(

Original comment by classi...@floodgap.com on 15 May 2013 at 2:34

GoogleCodeExporter commented 9 years ago
JaegerMonkey is dead, as are all bugs depending on it. We will implement 
BaselineCompiler first and hope it's good enough to get 24 bootstrapped.

Original comment by classi...@floodgap.com on 17 May 2013 at 4:39

GoogleCodeExporter commented 9 years ago
Almost there. Still left to define:

  "js::ion::CodeGeneratorPPC::emitBranch(js::ion::Assembler::DoubleCondition, js::ion::MBasicBlock*, js::ion::MBasicBlock*)", referenced from:
  "js::ion::Assembler::trace(JSTracer*)", referenced from:
  "js::AsmJSMachExceptionHandler::AsmJSMachExceptionHandler()", referenced from:
  "js::ion::AutoFlushCache::flushAnyway()", referenced from:
  "js::AsmJSMachExceptionHandler::release()", referenced from:
  "js::ion::AutoFlushCache::~AutoFlushCache()", referenced from:
  "js::AsmJSMachExceptionHandler::clearCurrentThread()", referenced from:

Original comment by classi...@floodgap.com on 27 May 2013 at 4:45

GoogleCodeExporter commented 9 years ago
LINKED SUCCESSFULLY

Original comment by classi...@floodgap.com on 28 May 2013 at 1:44

GoogleCodeExporter commented 9 years ago
Ben, if you're around, check me on how your G3/G4 trampolines are generated (I 
need to unwind them to get the branch target):

inline: lis ha(target)
inline: b/bc to trampoline
trampo: ori lo(target)
trampo: mtctr/b(c)ctr(l)

inline: b/bc directly to target
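
A minimal sketch of unwinding the lis/ori pair to recover the target, under some assumptions: the instruction encodings are the standard PowerPC ones (lis is addis with rA = 0, primary opcode 15; ori is primary opcode 24), the lis immediate is taken to be the plain upper 16 bits of the target since ori does not sign-extend, and the register numbers and helper names are hypothetical. It does not handle the b/bc instruction sitting between the two halves.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kOpAddis = 15;  // lis rD,imm is addis rD,0,imm
constexpr std::uint32_t kOpOri   = 24;

// Encodes "lis rD, imm".
inline std::uint32_t encodeLis(unsigned rD, std::uint16_t imm) {
    return (kOpAddis << 26) | (std::uint32_t(rD) << 21) | imm;
}

// Encodes "ori rA, rS, imm".
inline std::uint32_t encodeOri(unsigned rA, unsigned rS, std::uint16_t imm) {
    return (kOpOri << 26) | (std::uint32_t(rS) << 21) | (std::uint32_t(rA) << 16) | imm;
}

// Rebuilds the 32-bit target from the two immediates.
inline std::uint32_t targetFromLisOri(std::uint32_t lisInsn, std::uint32_t oriInsn) {
    assert((lisInsn >> 26) == kOpAddis);      // must be addis ...
    assert(((lisInsn >> 16) & 31) == 0);      // ... with rA = 0, i.e. lis
    assert((oriInsn >> 26) == kOpOri);
    return ((lisInsn & 0xFFFFu) << 16) | (oriInsn & 0xFFFFu);
}
```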

Original comment by classi...@floodgap.com on 15 Jun 2013 at 10:52

GoogleCodeExporter commented 9 years ago
Working from memory, the only addition is that in the direct branch case it's

inline: lis (offset to trampoline) // in case a future repatch needs the 
trampoline
inline: b/bc to target

after linking. 

Original comment by magef...@gmail.com on 16 Jun 2013 at 2:55

GoogleCodeExporter commented 9 years ago
Ugh. How can I tell the difference?

Original comment by classi...@floodgap.com on 16 Jun 2013 at 2:58

GoogleCodeExporter commented 9 years ago
Maybe I'll just disable the G3/G4 trampolines while I debug this.

Anyway, 24 now finally builds after working around Mozilla's crap in bug 
881882, but doesn't link. We're still missing

Undefined symbols:
  "js::IsAsmJSModuleNative(int (*)(JSContext*, unsigned int, JS::Value*))", referenced from:
      JS_CloneFunctionObject(JSContext*, JSObject*, JSObject*) in libjs_static.a(jsapi.o)
      js::NewFunction(JSContext*, JS::Handle<JSObject*>, int (*)(JSContext*, unsigned int, JS::Value*), unsigned int, JSFunction::Flags, JS::Handle<JSObject*>, JS::Handle<JSAtom*>, js::gc::AllocKind, js::NewObjectKind) in libjs_static.a(jsfun.o)
      BindNameToSlotHelper(JSContext*, js::frontend::BytecodeEmitter*, js::frontend::ParseNode*) in libjs_static.a(BytecodeEmitter.o)
      EmitFunc(JSContext*, js::frontend::BytecodeEmitter*, js::frontend::ParseNode*) in libjs_static.a(BytecodeEmitter.o)
      JSScript::getFunction(unsigned long) in libjs_static.a(Interpreter.o)
      MaybeCheckEvalFreeVariables(JSContext*, JS::Handle<JSScript*>, JS::Handle<JSObject*>, js::frontend::Parser<js::frontend::FullParseHandler>&, js::frontend::ParseContext<js::frontend::FullParseHandler>&) in libjs_static.a(BytecodeCompiler.o)
      MaybeCheckEvalFreeVariables(JSContext*, JS::Handle<JSScript*>, JS::Handle<JSObject*>, js::frontend::Parser<js::frontend::FullParseHandler>&, js::frontend::ParseContext<js::frontend::FullParseHandler>&) in libjs_static.a(BytecodeCompiler.o)
      js::frontend::CompileScript(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSScript*>, JS::CompileOptions const&, unsigned short const*, unsigned long, JSString*, unsigned int, js::SourceCompressionToken*) in libjs_static.a(BytecodeCompiler.o)
      js::frontend::CompileScript(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSScript*>, JS::CompileOptions const&, unsigned short const*, unsigned long, JSString*, unsigned int, js::SourceCompressionToken*) in libjs_static.a(BytecodeCompiler.o)
      js::ion::BaselineCompiler::emit_JSOP_LAMBDA()     in libjs_static.a(BaselineCompiler.o)
      js::ion::BaselineCompiler::emit_JSOP_DEFFUN()     in libjs_static.a(BaselineCompiler.o)
      js::ion::IonBuilder::jsop_lambda(JSFunction*)      in libjs_static.a(IonBuilder.o)
      js::ion::IonBuilder::jsop_deffun(unsigned int) in libjs_static.a(IonBuilder.o)
      js::ion::IonBuilder::jsop_deffun(unsigned int) in libjs_static.a(IonBuilder.o)
  "js::ion::BaselineCompilerARM::BaselineCompilerARM(JSContext*, JS::Handle<JSScript*>)", referenced from:
      js::ion::BaselineCompiler::BaselineCompiler(JSContext*, JS::Handle<JSScript*>) in libjs_static.a(BaselineCompiler.o)
  "js::CallAsmJS(JSContext*, unsigned int, JS::Value*)", referenced from:
      __data@0 in libjs_static.a(AsmJS.o)
  "js::LinkAsmJS(JSContext*, unsigned int, JS::Value*)", referenced from:
      __data@0 in libjs_static.a(AsmJS.o)
  "js::ion::ParallelGetPropertyIC::initializeAddCacheState(js::ion::LInstruction*, js::ion::AddCacheState*)", referenced from:
      vtable for js::ion::ParallelGetPropertyIC in libjs_static.a(IonCaches.o)
ld: symbol(s) not found
collect2: ld returned 1 exit status

Original comment by classi...@floodgap.com on 15 Jul 2013 at 12:22

GoogleCodeExporter commented 9 years ago
Linked on 24. Now to get BaselineCompiler up.

Original comment by classi...@floodgap.com on 15 Jul 2013 at 12:33

GoogleCodeExporter commented 9 years ago
BaselineCompiler works except for two edge cases.

It's time to optimize.

Original comment by classi...@floodgap.com on 11 Oct 2013 at 4:27

GoogleCodeExporter commented 9 years ago
I think I already mentioned this once, but it should be taken into account in 
order to improve JIT performance on G5 CPUs.
Here is an excerpt from an IBM document describing the errata of the 970FX 
series (I'd think it applies to the 970MP as well):
________________________________________________________________________________
Erratum #20: Possible branch prediction performance degradation for 32-bit mode 
applications

Overview
In 32-bit mode applications only, the branch prediction mechanism will not 
always work as intended for Branch Conditional to Link Register/Count Register 
instructions (bclr[l], bcctr[l]).

Detailed Description
The upper 32-bits of the count or link registers (CTR or LR) are always being 
used in the branch conditional address compare for target prediction. Typically 
these bits are zero, but with backwards branching, the address subtraction 
could cause some non-zero values. These should be zero'd out for 32-bit mode 
applications, and thus could cause incorrect predictions which generate false 
internal prefetches.

Projected Impact
This only has performance implications (no functional impact) and only affects 
32-bit mode applications when performing backwards branching. With the bclr and 
bcctr instructions, this bug can result in branch misprediction and possible 
other penalties caused by attempts to prefetch from an invalid address, thus 
the desired performance gains of prediction will not be realized. The impact 
specifically for bclr is expected to be minimal since we anticipate that the 
upper 32 bits will usually be ‘0’ for move-to Link Register instructions.

Workaround
The software workaround is to clear out the upper 32-bits of the source when 
writing the CTR or LR via the move-to-CTR/LR instructions in 32-bit mode only.
rlwinm G1, G1, 0, 0, 31
mtctr/mtlr G1.

Status
A fix is not planned at this time for the PowerPC 970FX.
________________________________________________________________________________
Given the fix is that simple I suggest this be added to the macro assemblers 
(I'm actually thinking of adding it to gcc as well - in case it isn't already 
built in).
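
To make the workaround concrete, here is a minimal Python model (not PowerPC code, and not anything from the tree). It treats a GPR as a 64-bit value, shows how a subtraction that goes negative leaves the upper 32 bits set, and then applies the mask that the rlwinm in the workaround is meant to apply. The helper names are made up for illustration.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def sub64(a, b):
    """Model a 64-bit register subtraction a - b (wraps modulo 2**64)."""
    return (a - b) & MASK64

def clear_upper32(value):
    """Model the erratum workaround: keep only the low 32 bits before mtctr/mtlr."""
    return value & MASK32

# Hypothetical operands: any subtraction with a negative result behaves this way.
raw = sub64(0x1000, 0x2000)
print(hex(raw))                 # upper 32 bits are all set
print(hex(clear_upper32(raw)))  # only the low word remains
```

A value whose upper word is already zero passes through the mask unchanged, so the only cost of applying it everywhere is the extra instruction.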

Original comment by Tobias.N...@gmail.com on 22 Oct 2013 at 7:56

GoogleCodeExporter commented 9 years ago
As we don't use bclr (at least in Nitro, or is there some hidden use?), that's 
no problem.

And for bcctr, as I understand it, it only affects us if the value (including 
the normally ignored upper 32 bits) that mtctr moves into CTR is negative and 
has some of the upper 32 bits set. That in turn cannot happen in a 
lis->ori->mtctr sequence (the lis clears the upper 32 bits, I suppose) but only 
in a subf->mtctr sequence.

Seemingly after a subf instruction the upper 32 bits, or some of them may 
remain set, but the description doesn't say under what circumstances and 
conditions and the PowerISA only says that all 64 bits of the GPRs are taken 
into account during effective address computation, and that a subf forms a 2s 
complement of RA - which I think might be the source of those nasty high bits.

However, in Nitro a subf->mtctr sequence is emitted in all branchSub32() 
functions (and only there as far as I could find out) - so maybe one should 
just add that rlwinm workaround instruction after the subfo_rc, or would that 
wreck that stanza stuff?

Original comment by Tobias.N...@gmail.com on 22 Oct 2013 at 10:42

GoogleCodeExporter commented 9 years ago
I realize it's not as easy as I thought - branchSub32() has nothing to do with 
branch target address computation.
And yes, having that rlwinm emitted before mtctr would involve extending the 
branch stanza to five words.
But if I'm right bcctr is only used in the branch stanza and the effective 
address is never computed at runtime of the emitted code but always at compile 
time using a lis->ori sequence. And that would make Nitro (and hence MethodJIT) 
generated code unaffected by this erratum - but I didn't check PPCBC, maybe 
it's needed there?

Original comment by Tobias.N...@gmail.com on 23 Oct 2013 at 12:21

GoogleCodeExporter commented 9 years ago
Turns out "backwards branching" means branching to a lower address - and I 
guess that's quite frequent in JavaScript JIT code. Hence a fix might increase 
performance significantly - and especially the "possible other penalties caused 
by attempts to prefetch from an invalid address" should probably be avoided by 
all means.

So I guess it's time to extend the branch stanza...

Original comment by Tobias.N...@gmail.com on 23 Oct 2013 at 12:32

GoogleCodeExporter commented 9 years ago
Yes, I think we talked about this a little before. In fact, I did some looking 
at a related issue when I evaluated splitting up mtctr/bc(c)tr in the G5 
stanza, and actually, I think this will have a negative impact overall for the 
following reasons:

- Most of the stanzas are optimized to direct b/bl anyway, worst case 90% or 
more based on my statistics from PPCBC. (Correct, we don't use bcl anywhere 
that I can recall, even though we can generate it.)
- For the few that are not, the addresses are direct 32-bit loads from memory; 
they're not computed addresses, so the chance of the upper 32 bits being 
non-zero is very low.
- Since we have to widen all the branch stanzas for this (and also for 
inserting nops between the mtctr and bc(c)tr), it probably has a negative 
impact on cache for a likely rare interaction.

But that's just my own analysis; I'm open to other ideas.

I have IonMonkey now to the point where it generates code and gets to the 
invalidation bailout, which I still have to write. I had to fix some endian 
problems in the Ion macro assembler first though, grr.

Original comment by classi...@floodgap.com on 23 Oct 2013 at 4:14

GoogleCodeExporter commented 9 years ago
(Or they're a lis/ori, which will clear the upper 32-bits anyway just like an 
lwz would.)

Original comment by classi...@floodgap.com on 23 Oct 2013 at 4:16

GoogleCodeExporter commented 9 years ago
I think I now understand somewhat more:
- as lis is an addis instruction (which is D-form), it will always set all 32 
high bits when loading a negative immediate (and it will clear them when 
loading a positive one), and a subsequent ori will not touch those bits
- since we don't ever seem to load negative immediates using lis, we don't have 
a problem here
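
The sign-extension behaviour described above can be sketched in Python as a rough model of lis/ori on a 64-bit register (the function names are illustrative, not from any assembler). The two immediate pairs are taken from disassembly elsewhere in this thread: lis 656 / ori 1696, and the lis -8531 / ori 48879 pair that forms 0xdeadbeef.

```python
MASK64 = (1 << 64) - 1

def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed 16-bit value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def lis(simm):
    """lis rD,SIMM is addis rD,0,SIMM: (SIMM << 16), sign-extended to 64 bits."""
    return (sign_extend16(simm) << 16) & MASK64

def ori(reg, uimm):
    """ori ORs a zero-extended 16-bit immediate into the low halfword only."""
    return reg | (uimm & 0xFFFF)

positive = ori(lis(656), 1696)
negative = ori(lis(-8531), 48879)
print(hex(positive))   # upper 32 bits clear
print(hex(negative))   # upper 32 bits all set
```

So whether the upper word ends up clear depends only on the sign of the lis immediate, which matches the two bullet points above.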

Original comment by Tobias.N...@gmail.com on 23 Oct 2013 at 1:08

GoogleCodeExporter commented 9 years ago
Back to work on this for 26. For our test case

gdb7 js --ion-eager -e 'var i=0'

we get through codegen and it immediately invalidates (a check with #jsapi 
indicates this is expected behaviour). Invalidation requires overwriting a 
saved location in the invalidation epilogue, but we are way off, apparently:

[Codegen] Created IonScript 0x1d25240 (raw 0xe98010)
|>
[Snapshots] Recover PC & Script from the last frame.
[Snapshots] Creating snapshot reader
[Snapshots] Read snapshot header with frameCount 1, bailout kind 0 (ra: 1)
[Snapshots] Read pc offset 0, nslots 1
[Invalidate] Start invalidation.
<Invalidate [Invalidate]  Invalidate -e:1, IonScript 0x1d25240
[Invalidate] BEGIN invalidating activation
[Invalidate] #1 exit frame @ 0xbfffe8e4
[Invalidate] #2 Optimized JS frame @ 0xbfffe978, -e:1 (fun: 0x0, script: 0x2a29100, pc 0xe98758)
[Codegen] ##patchWrite_Imm32 on 00e98750
[Invalidate]    ! Invalidate ionScript 0x1d25240 (ref 2) -> patching osipoint 0xe988b4
[Codegen] patchWrite_NearCall: 0xe988b4 to 0xe99130 (offset 2172)

[Invalidate] END invalidating activation
|>
Assertion failure: token, at /Volumes/BruceDeuce/src/mozilla-26.0/js/src/jit/ppcosx/IonFrames-ppc.cpp:26

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
0x005beae0 in js::jit::InvalidationBailoutStack::checkInvariants (this=<value temporarily unavailable, due to optimizations>) at /Volumes/BruceDeuce/src/mozilla-26.0/js/src/jit/ppcosx/IonFrames-ppc.cpp:26
26          JS_ASSERT(token);
(gdb) disas 0xe988a0 0xe988c0
Dump of assembler code from 0xe988a0 to 0xe988c0:
0x00e988a0:     lwz     r6,12(r1)
0x00e988a4:     lwz     r5,8(r1)
0x00e988a8:     lwz     r4,4(r1)
0x00e988ac:     lwz     r3,0(r1)
0x00e988b0:     addi    r1,r1,72
0x00e988b4:     b       0xe99130
0x00e988b8:     stw     r25,68(r1)
0x00e988bc:     stw     r24,64(r1)
End of assembler dump.
(gdb) 

patchWrite_Imm32 gets passed a location after a call stanza, but it's wrong (we 
wound it back to an effective address minus PPC_CALL_STANZA_LENGTH). 0xe988b4 
seems logical.

Original comment by classi...@floodgap.com on 2 Jan 2014 at 4:04

GoogleCodeExporter commented 9 years ago
Actually, I don't think that's right either. We should look to see what we 
overwrote there too. The osiPoint *should* be the nop we stick in the 
invalidation epilogue. Where is that?

Original comment by classi...@floodgap.com on 2 Jan 2014 at 5:36

GoogleCodeExporter commented 9 years ago
The token assertion was fixed by looking at the stack. The frame pointer was 
off by 4 (so an off-by-one-word error). I patched this in IonFrames-ppc.cpp for 
now and the token seems to match.

After some more thought and work in the debugger, the safepoints seem to be 
placed (on arbitrary LIR instruction boundaries) by 
jit/CodeGenerator:generateBody. However, the osiPoint doesn't match anything 
that Ion thinks it should be.

[Codegen] Created IonScript 0x1d25240 (raw 0xe98010)
|>
[Snapshots] Recover PC & Script from the last frame.
trying 0000043c
trying 000008a8
[Snapshots] Creating snapshot reader
[Snapshots] Read snapshot header with frameCount 1, bailout kind 0 (ra: 1)
[Snapshots] Read pc offset 0, nslots 1
[Invalidate] Start invalidation.
<Invalidate [Invalidate]  Invalidate -e:1, IonScript 0x1d25240
[Invalidate] BEGIN invalidating activation
[Invalidate] #1 exit frame @ 0xbfffe8e4
[Invalidate] #2 Optimized JS frame @ 0xbfffe978, -e:1 (fun: 0x0, script: 0x2a29100, pc 0xe98758)
[Codegen] ##patchWrite_Imm32 on 00e98750
[Invalidate]    ! Invalidate ionScript 0x1d25240 (ref 2) -> patching osipoint 0xe988b4
[Codegen] patchWrite_NearCall: 0xe988b4 to 0xe99130 (offset 2172)

[Invalidate] END invalidating activation
|>
IonFrames-ppc.cpp: frame should be bfffe978
[Invalidate] IonScript 0x1d25240 has method 0x2a39100 raw 0xe98010
retaddr = 00e98754
disp = 00000744
trying 0000043c
trying 000008a8
trying 00000ce4
Assertion failure: false (MOZ_ASSUME_UNREACHABLE(Failed to find OSI point return address)), at /Volumes/BruceDeuce/src/mozilla-26.0/js/src/jit/Ion.cpp:1061

8a8 computes out to the instruction following 0xe988b4. So how did LR get so 
off base?

Original comment by classi...@floodgap.com on 3 Jan 2014 at 4:32

GoogleCodeExporter commented 9 years ago
0x00e98740:     bl      0xe98744 << LR is now 0xe98744
0x00e98744:     mflr    r12
0x00e98748:     addi    r12,r12,20
0x00e9874c:     stwu    r12,-4(r1) << push 0xe98758 into Ion frame as return address
0x00e98750:     .long 0x9e0 << original bl overwritten by patchWrite_Imm32 << LR is now 0xe98754
0x00e98754:     li      r0,2816
0x00e98758:     addi    r1,r1,-72
0x00e9875c:     stw     r25,68(r1)
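
The address arithmetic in that listing can be checked with a small Python sketch. It only models the five-instruction sequence shown above (bl .+4, mflr, addi, stwu, then the real call); the function name is made up, and the code base 0xe98010 is the "raw" value from the earlier log.

```python
INSN = 4  # bytes per PowerPC instruction

def call_stanza_addresses(first_bl, addi_imm=20):
    """Given the address of the leading `bl .+4`, return (pushed, lr_at_callee)."""
    lr_after_first_bl = first_bl + INSN    # bl .+4 leaves LR at the next instruction
    pushed = lr_after_first_bl + addi_imm  # mflr r12; addi r12,r12,20; stwu r12,-4(r1)
    second_bl = first_bl + 4 * INSN        # bl, mflr, addi, stwu, then the real call
    lr_at_callee = second_bl + INSN        # LR as seen inside the callee
    return pushed, lr_at_callee

pushed, lr = call_stanza_addresses(0x00e98740)
code_base = 0x00e98010  # "raw" from the log
print(hex(pushed), hex(lr), hex(lr - code_base))
```

The two values differ by one instruction: the pushed return address is 0xe98758 while LR is 0xe98754, and it is the LR-derived displacement 0x744 that shows up as "disp" in the log.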

Original comment by classi...@floodgap.com on 3 Jan 2014 at 4:42

GoogleCodeExporter commented 9 years ago
The call patchWrite_Imm32 overwrites is calling a VMWrapper. 

0x00e98720:     stwu    r3,-4(r1)
0x00e98724:     li      r0,5
0x00e98728:     stwu    r0,-4(r1)
0x00e9872c:     lis     r0,656
0x00e98730:     ori     r0,r0,1696
0x00e98734:     stwu    r0,-4(r1)
0x00e98738:     li      r0,2240
0x00e9873c:     stwu    r0,-4(r1)
0x00e98740:     bl      0xe98744
0x00e98744:     mflr    r12
0x00e98748:     addi    r12,r12,20
0x00e9874c:     stwu    r12,-4(r1)
0x00e98750:     bl      0xe79b10    <<<<<<<<
0x00e98754:     li      r0,2824
0x00e98758:     addi    r1,r1,-72 ; PushRegsInMask(save)
0x00e9875c:     stw     r25,68(r1)
0x00e98760:     stw     r24,64(r1)

0x00e79b10:     lis     r12,227
[...]
0x00e79b78:     bl      0x595440 <_ZN2js3jit13DefVarOrConstEP9JSContextN2JS6HandleIPNS_12PropertyNameEEEjNS4_IP8JSObjectEE>

js::jit::DefVarOrConst() then forces an invalidation and sends execution into 
the invalidation epilogue, which we control. The LR is getting pushed by our 
push LR (surprise surprise) in the epilogue. LR is as set above.

Disassembling further,

0x00e9875c:     stw     r25,68(r1) ; PushRegsInMask(save) continued from above
0x00e98760:     stw     r24,64(r1)
[...]
0x00e987d4:     lis     r3,-8531
0x00e987d8:     ori     r3,r3,48879 ; 0xdeadbeef ; see shared/CodeGenerator-shared.cpp "This first move is here"
[...]
0x00e98810:     mflr    r18
0x00e98814:     andi.   r0,r1,15
0x00e98818:     beq-    0xe98824
0x00e9881c:     li      r0,2636
0x00e98820:     trap
0x00e98824:     bl      0x5665a4 <_ZN2js3jit8TraceLIREjjjPKcS2_P8JSScriptPh>
0x00e98828:     li      r0,2636
0x00e9882c:     mtlr    r18
0x00e98830:     lwz     r18,0(r1)
0x00e98834:     mr      r1,r16
0x00e98838:     lfd     f13,80(r1) ; PopRegsInMask(save)
[...]
0x00e988ac:     lwz     r3,0(r1)
0x00e988b0:     addi    r1,r1,72 ; end of PopRegsInMask

0x00e988b4:     addi    r1,r1,-72    <<<<<< OSI point return address
0x00e988b8:     stw     r25,68(r1)

During codegen, however, Ion alleges that the OsiPoint is indeed eventually at 
0x00e98750:

[Codegen] 02056d1c --- lis r0,656 (0x2900000)
[Codegen] 02056d20 --- ori r0,r0,1696 (0x6a0)
[Codegen] 02056d24 --- stwu r0,-4(sp)
[Codegen] == callWithExitFrame(ion *) ==
[Codegen] == push(imm) ==
[Codegen] 02056d28 --- li r0,2240 (0x8c0)
[Codegen] 02056d2c --- stwu r0,-4(sp)
[Codegen] == call(IonCode) ==
[Codegen] 02056d30 --- bl .+4
[Codegen] 02056d34 --- mfspr r12,lr
[Codegen] 02056d38 --- addi r12,r12,20 (0x14)
[Codegen] 02056d3c --- stwu r12,-4(sp)
[Codegen] 02056d40 --- bl .+8
[Codegen] #label     ((1864))
[Codegen] instruction OsiPoint   <<<<<<<<<
[Codegen] == reserveStack(u32) ==
[Codegen] 02056d48 --- subi sp,sp,72
[Codegen] == store32(reg, adr) ==
[Codegen] 02056d4c --- stw r25,68(sp)
[Codegen] == store32(reg, adr) ==

Original comment by classi...@floodgap.com on 3 Jan 2014 at 5:33

GoogleCodeExporter commented 9 years ago
Okay, the problem is more complex than I thought. It seems to be barfing on the 
constant pools; it retroactively asks through masm.actualOffset() where 
location X got moved after the constant pools and we don't have that 
information, and this explains the variation between the return address and the 
expected location. So I hacked Ion.cpp to make a guess and pick the next 
location after it. And -e 'var i=0' works.
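
For the record, one reading of "pick the next location after it" is sketched below in Python. This is not the actual change to Ion.cpp; the function name and the strictly-greater comparison are assumptions. The candidate offsets are the "trying ..." values from the earlier log and 0x744 is the logged displacement.

```python
def guess_osi_offset(candidates, return_disp):
    """Return the smallest candidate offset strictly greater than return_disp, or None."""
    later = [c for c in candidates if c > return_disp]
    return min(later) if later else None

candidates = [0x43c, 0x8a8, 0xce4]  # the "trying ..." values from the log
guess = guess_osi_offset(candidates, 0x744)
print(hex(guess))
```

With those inputs the guess lands on 0x8a8, the offset that computes out to the instruction following 0xe988b4. A guess like this can obviously pick the wrong point if another OSI point falls in between.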

The proper solution is to rewrite Baseline and Ion to not use the JSC 
MacroAssembler at all, and instead use the ARM constant buffer implementation 
that does provide this information. Given that Mozilla is getting unhappy with 
the slow pace of progress on YARR and may defect to V8 irregexp, we may have to 
do this sooner rather than later.

Meanwhile, Ben, if you have any ideas, please advise. This is not a great hack 
even though so far it seems to work.

Now for additional simple scripts. The good news is that none of this regresses 
Baseline.

Original comment by classi...@floodgap.com on 4 Jan 2014 at 5:32

GoogleCodeExporter commented 9 years ago
bz on #jsapi suggested hacking the shell so that it doesn't generate jsop_popv 
instructions, which Ion does not (yet?) support. So I did that.

Original comment by classi...@floodgap.com on 4 Jan 2014 at 7:45

GoogleCodeExporter commented 9 years ago
Current status with --ion-eager:

- 'var i=0' passes
- 'print(3)' passes: fixed native call register allocation
- 'var i=3; print(i)' passes: fixed a problem with retarget() not rewinding 
branch stanzas
- No PPCBC regression so far

Original comment by classi...@floodgap.com on 5 Jan 2014 at 5:37

GoogleCodeExporter commented 9 years ago
Fixed an endian problem in LIR slot allocation (in two places), revised 
branching macroops to allow rollback and rewrote the bailout thunks and tables. 
Our bailout table now just looks like a block of call stanzas, which seems the 
simplest arrangement. Not quite to the point of executing loops yet. One 
troublesome issue is that Ion will frequently bail out to Baseline on very 
small scripts, which confounds testing because the script finishes before 
Baseline hops back into Ion.

Original comment by classi...@floodgap.com on 6 Jan 2014 at 4:08

GoogleCodeExporter commented 9 years ago
Fixed a couple more assertions. We now run

var i,j=0; for(i=0;i<50000;i++) { j+=i } print(j);

though it bails out to Baseline. However, it does not assert anymore. 500,000 
does assert; we have a bug in the double MacroOps somewhere that I need to fix.

Original comment by classi...@floodgap.com on 7 Jan 2014 at 4:51

GoogleCodeExporter commented 9 years ago
The 500,000 loop case no longer asserts in either --ion-eager or regular, but 
it overflows the integer and does not seem to properly emit double code (though 
it does detect the overflow, and a bailout occurs).
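
To show why the 50,000 case stays in integers while the 500,000 case needs doubles, here is a small Python model of the loop's arithmetic. It is only a sketch of the expected semantics (int32 add with an overflow check, then switching to double), not of what Ion emits, and the function name is made up.

```python
INT32_MAX = 2**31 - 1

def sum_with_overflow_check(limit):
    """Sum 0..limit-1 as int32, switching to float on the first overflow.

    Returns (total, i_at_overflow); i_at_overflow is None if no overflow occurred.
    """
    j = 0
    overflow_at = None
    for i in range(limit):
        if overflow_at is None and j + i > INT32_MAX:
            overflow_at = i   # the add that no longer fits in int32
            j = float(j)      # continue in double precision from here on
        j += i
    return j, overflow_at

print(sum_with_overflow_check(50000))
print(sum_with_overflow_check(500000))
```

In this model the first add that overflows is at i = 65536, and the correct final total for the 500,000 loop is 124999750000, which is far outside int32 range.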

Original comment by classi...@floodgap.com on 13 Jan 2014 at 3:53

GoogleCodeExporter commented 9 years ago
Starting program: /Volumes/BruceDeuce/src/mozilla-26.0/obj-ff-dbg/dist/bin/js --ion-eager -e var\ i,j=0\;\ for\(i=0\;i\<500000\;i++\)\ \{\ if\ \(\!\(i\ %\ 10000\)\)\ print\(j\)\;\ j+=i\ \}\ print\(j\)\;
warning: Could not find malloc init callback function.  
Make sure malloc is initialized before calling functions.
Reading symbols for shared libraries 
....................................................................+++ done
[Bailouts] Took invalidation bailout! Snapshot offset: 10
[Bailouts] Took bailout! Snapshot offset: 40
0
49995000
199990000
449985000
799980000
1249975000
1799970000
[Bailouts] Took bailout! Snapshot offset: 112
[Bailouts] Took bailout! Snapshot offset: 88
[Bailouts] Took bailout! Snapshot offset: 63
2147581953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
2147651953
[Bailouts] Took bailout! Snapshot offset: 98
2147651953

Original comment by classi...@floodgap.com on 13 Jan 2014 at 3:54

GoogleCodeExporter commented 9 years ago
Fixed. Bad store slot instruction. Now for literal doubles.

Original comment by classi...@floodgap.com on 14 Jan 2014 at 1:32