microvm / microvm-meta

We have moved: https://gitlab.anu.edu.au/mu/general-issue-tracker

ASM-style IR builder #56

Open wks opened 8 years ago

wks commented 8 years ago

This issue discusses a higher-level abstraction over the IR builder API. It will allow the client to construct the Mu IR CFG in a stateful style. The stateful builder holds a pointer to the "current basic block" at all times, and new instructions are implicitly appended to the end of that block. Such an interface can also emulate fall-through-style ASM instructions, such as JL, JE, JNE, etc.

It is a layer above the API. The muapi.h should still be kept minimal.

There is a problem with the implementation. Such a builder is easy to build for SSA form, but since we have switched to the "goto-with-values" form, more book-keeping needs to be done in the client. We probably still need a soup of objects in the client, on which the client does liveness analysis and converts SSA to the goto-with-values form.

wks commented 8 years ago

Client-side Mu IR code generators (especially those of tracing JIT compilers) usually want an assembly-style Mu IR builder. Since a trace is a long list of instructions with side exits, its natural counterpart at a lower level is a list of ASM instructions with fall-through-style branching instructions.

However, the Mu IR has explicit basic blocks, and branching instructions must specify all destinations. This is inconvenient for the client's compiler. It is also verbose to have to mention the MuCtx and the basic block every time an instruction is created.

The solution is to provide a wrapper that keeps a reference to the "current basic block". All methods that create instructions will implicitly use the current basic block.

class StatefulIRBuilder:
    def __init__(self, ctx, func_ver):
        self.ctx = ctx
        self.func_ver = func_ver
        self.cur_bb = None

    def set_bb(self, bb):
        self.cur_bb = bb

    def binop(self, optr, ty, lhs, rhs):
        inst = self.ctx.new_binop(self.cur_bb, optr, ty, lhs, rhs)  # cur_bb is implicit to the caller
        result = self.ctx.new_inst_res(inst)
        return result

    def cmpxchg(self, is_ptr, is_weak, ord_succ, ord_fail, loc, expected, desired):
        inst = self.ctx.new_cmpxchg(self.cur_bb, is_ptr, is_weak, ord_succ, ord_fail, loc, expected, desired)
        old_value = self.ctx.new_inst_res(inst)    # the CMPXCHG instruction has two results
        is_succ = self.ctx.new_inst_res(inst)    # the second is an int<1>, whether CMPXCHG is successful
        return (old_value, is_succ)

    def call(self, sig, func, args, nresults, keepalives=()):   # immutable default avoids a shared mutable list
        inst = self.ctx.new_call(self.cur_bb, sig, func, args)
        self.ctx.set_keepalives(inst, keepalives)   # One call to StatefulIRBuilder method may involve many low-level calls
        results = []    # The CALL instruction may return many results, depending on the callee's signature
        for i in range(nresults):
            results.append(self.ctx.new_inst_res(inst))
        return results

    def jmp(self, bb, args):
        inst = self.ctx.new_branch(self.cur_bb)
        self.ctx.set_dest(inst, NORMAL, bb, args)
        self.set_bb(bb)   # since BRANCH is a terminating instruction, the builder automatically sets itself to point to the destination
        return None

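    def jt(self, cond, bb, args):
        """Jump to bb if cond == 1; otherwise fall through to a fresh basic block."""
        fall_through = self.ctx.new_bb(self.func_ver)
        # TODO: We use the goto-with-values form. Live variables must be passed to the next basic block.
        # live_variables_in_the_current_basic_block() and type_of() are placeholders for that book-keeping.
        live_vars = self.live_variables_in_the_current_basic_block()
        for var in live_vars:
            # One parameter per live variable; later code must use the parameter instead of var.
            param = self.ctx.new_nor_param(fall_through, type_of(var))
        inst = self.ctx.new_branch2(self.cur_bb, cond)   # takes cur_bb like the other new_* calls
        self.ctx.add_dest(inst, IF_TRUE, bb, args)
        self.ctx.add_dest(inst, IF_FALSE, fall_through, live_vars)
        self.set_bb(fall_through)
        return None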

Problem with the goto-with-values form

One difficulty is that since we switched to the goto-with-values form, basic block parameters need to be explicit. If we were using the traditional SSA form, variables from previous blocks could be used directly. Now the builder not only has to record all live variables (it probably cannot infer whether they are really "live", because that requires looking into the future), but also has to know the types of those variables, which needs extra book-keeping.
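To make the book-keeping concrete, here is a minimal sketch of what the builder would have to remember. It is only an illustration: VisibleVars, define, fall_through and fake_new_param are hypothetical names, not part of muapi.h, and variables and types are plain strings. Because the builder cannot look into the future, the sketch conservatively treats every variable defined so far as live and passes all of them to the fall-through block.

```python
import itertools


class VisibleVars:
    """Hypothetical helper: tracks every variable visible in the current block and its type."""

    def __init__(self):
        self._types = {}  # variable -> type, in definition order

    def define(self, var, ty):
        """Record a newly created variable (e.g. an instruction result) and its type."""
        self._types[var] = ty
        return var

    def fall_through(self, new_param):
        """Enter a fresh block.

        new_param(ty) stands in for whatever low-level call creates a block
        parameter of type ty. Returns the arguments the branch must pass and
        a mapping from each old variable to the parameter that replaces it.
        """
        args = list(self._types)
        renamed = {}
        new_types = {}
        for var, ty in self._types.items():
            param = new_param(ty)
            renamed[var] = param
            new_types[param] = ty
        self._types = new_types  # only the parameters are visible in the new block
        return args, renamed


# Stand-in for the real parameter-creating call: hands out fresh names.
_counter = itertools.count()


def fake_new_param(ty):
    return "%p{}".format(next(_counter))


scope = VisibleVars()
scope.define("%a", "int<64>")
scope.define("%ok", "int<1>")
args, renamed = scope.fall_through(fake_new_param)
```

Passing everything is safe but wasteful. Trimming the list down to the variables that are really live needs the whole CFG, which is the motivation for the intermediate representation discussed next.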

This prompts me to argue that we probably need an intermediate CFG in the client before handing it to Mu.

We need to

  1. translate the trace into a CFG in regular SSA using such a stateful builder (the stateful builder is easy to implement for SSA), then
  2. convert SSA into the goto-with-values form (in this process, figure out the live variables and their types), then
  3. call the low-level muapi.h API (or its RPython counterpart) to build a bundle inside Mu.

The first two steps happen within the client, so the CFG in SSA or goto-with-values form needs to be represented in the client's language. For RPython, it is a soup of objects (FuncVer, BasicBlock, Instruction, ...), or just a list of lists of instructions.
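As a rough sketch of step 2, the following shows a tiny client-side CFG and a standard backward liveness analysis that derives the block parameters and branch arguments. Everything here is an assumption made for illustration: Inst, Block, live_in_sets and to_goto_with_values are made-up names, types are plain strings, and parameters reuse the SSA variable names instead of being renamed. It is plain Python rather than RPython.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Inst:
    result: Optional[str]  # SSA variable defined by this instruction, if any
    opcode: str
    operands: List[str]


@dataclass
class Block:
    name: str
    insts: List[Inst] = field(default_factory=list)
    succs: List[str] = field(default_factory=list)  # names of successor blocks


def live_in_sets(blocks: Dict[str, Block]) -> Dict[str, set]:
    """Classic iterative liveness: live_in = use | (live_out - defs)."""
    use, defs = {}, {}
    for b in blocks.values():
        u, d = set(), set()
        for inst in b.insts:
            u.update(v for v in inst.operands if v not in d)  # read before defined
            if inst.result is not None:
                d.add(inst.result)
        use[b.name], defs[b.name] = u, d
    live_in = {name: set() for name in blocks}
    changed = True
    while changed:  # repeat until a fixed point is reached
        changed = False
        for b in blocks.values():
            live_out = set()
            for s in b.succs:
                live_out |= live_in[s]
            new_in = use[b.name] | (live_out - defs[b.name])
            if new_in != live_in[b.name]:
                live_in[b.name] = new_in
                changed = True
    return live_in


def to_goto_with_values(blocks, var_types):
    """For each block: (typed parameters, {successor: arguments to pass})."""
    live_in = live_in_sets(blocks)
    out = {}
    for b in blocks.values():
        params = [(v, var_types[v]) for v in sorted(live_in[b.name])]
        dest_args = {s: sorted(live_in[s]) for s in b.succs}
        out[b.name] = (params, dest_args)
    return out


# A trace with one side exit: compare, then either exit or fall through.
blocks = {
    "entry": Block("entry", [Inst("c", "SLT", ["a", "b"])], ["side", "fall"]),
    "fall": Block("fall", [Inst("d", "ADD", ["a", "b"])], ["done"]),
    "done": Block("done", [Inst(None, "RET", ["d"])]),
    "side": Block("side", [Inst(None, "RET", ["a"])]),
}
var_types = {"a": "int<64>", "b": "int<64>", "c": "int<1>", "d": "int<64>"}
converted = to_goto_with_values(blocks, var_types)
```

In this example the entry block ends up with parameters a and b (the function arguments), the side exit receives only a, and the fall-through block receives a and b. Step 3 would then walk the converted structure and issue the low-level API calls.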