apt1002 / mijit

Experimental JIT compiler generator
https://github.com/apt1002/mijit/
25 stars 4 forks source link

Forbid passing values in Registers between `code::Machine::State`s #6

Closed apt1002 closed 4 years ago

apt1002 commented 4 years ago

Currently the States of beetle::Machine pass values to each other in registers. If an exception occurs (e.g. the specializer runs) what will happen to the register contents?

I think this is a design bug. We should specify that registers are corrupted between States. States should use Globals to pass values. Beetle should have some extra Globals Op1, Op2, Target etc. for this purpose.

Mijit's cache will optimize away unnecessary LoadGlobal instructions. Mijit should do dead value analysis to optimize away unnecessary StoreGlobal instructions.

There should be a way to mark a Global as "observable" to suppress dead value analysis. This would force Mijit to maintain the value of the Global even if it is write-only. I can't think of an example in Beetle; NotAddress and Bad are close.

apt1002 commented 4 years ago

I'm going to hold off on this until we have register allocation working. I think the requirements will be clearer then.

apt1002 commented 4 years ago

Proposal

I've gone round in circles a bit but I think a good approach might be as follows:

If a trap occurs, only the globals will be preserved. x86 registers and all other spill slots are corrupted. The trap handler can re-enter the Machine in whatever State it wants, and is responsible for initializing all the spill slots that are live in that State.

Mijit instructions (as proposed) can be translated fairly directly into x86 code. If an operation has a source that is not a register, the load_op form of the instruction can be used. If it has two or more non-register sources, extra load instructions (into temporary registers) might be needed. If the destination is not a register it then a store instruction might be needed. In all cases, the register allocation decisions are adequately represented in the Mijit code.

The optimizer already has code for caching globals, in order to eliminate unnecessary LoadGlobal and StoreGlobal instructions when constructing the dataflow graph. This can be generalized to all spill slots.

The optimizer allocates both x86 registers and spill slots. It tries to keep values in registers if possible, but if it runs out, it can allocate spill slots for long-lived values. The output format of the optimizer is Mijit code, i.e. exactly the same as its input format. The only difference from the Machine specification is that registers can be used more freely.

The CallingConvention of a history specifies (among other things) which spill slots and x86 registers are live. For a root history, this is just the result of the liveness analysis for the corresponding State (so all x86 registers are dead). For a history constructed by the optimizer, liveness is inferred from its retire code (and x86 registers can be live).

apt1002 commented 4 years ago

We've made good progress on this. We have done the data structures and made consequential changes to the Beetle implementation.

We noticed that the design still does not allow us to generate good code for immediate constants. We decided to reserve one slot to hold the value 0, and in future we will probably generalise this to other constants. However, we also need in future to allow operands to be constants (#15).

Allowing any source or destination to be a slot (as opposed to a register) complicates Lowerer somewhat. For a typical binary operation (i.e. with two sources and a destination), there are eight cases to consider, and it is laborious to generate correct efficient code in every case.

The hardest case is when both dest and src1 are slots. In that case, the x86 code requires two registers in addition to those mentioned in the Mijit code. We statically reserved one register (RC) for this purpose but we have needed brittle tricks to find the other one, and in one bad case we resorted to pushing RA on the stack to make room.

This indicates that the Mijit code is inadequately representing the register allocation decisions. I think it is worth a slight rethink.

Proposal (#16): the destination must be a register (not a slot). This allows us to delete half the cases, including all the hard cases. It shifts the burden of choosing a destination register onto whatever generates the Mijit code, thereby better representing the register allocation in the code. Spilling a computed value requires an extra Mijit instruction, which is fine.