faster-cpython / ideas


Change the interpreter definition DSL to be legal C syntax #477

Closed markshannon closed 1 year ago

markshannon commented 1 year ago

The proposed DSL for defining the interpreter has sound semantics, but the syntax is problematic due to lack of tooling. We want to be able to edit and view the code in standard tools.

The 3.11 (and earlier) instruction definitions are understood by tools that understand C, and we want to retain that.

To do that we need to change the DSL to be a subset of C, as follows:

From the examples,

inst LOAD_ATTR_SLOT = #counter CHECK_OBJECT_TYPE LOAD_SLOT ####unused;

becomes

inst(LOAD_ATTR_SLOT) = counter*1 + CHECK_OBJECT_TYPE + LOAD_SLOT + unused*4;

Alternatively we could define S1, S2 and S4 for values in the instruction stream:

inst(LOAD_ATTR_SLOT) = S1(counter) + CHECK_OBJECT_TYPE + LOAD_SLOT + S4(unused);
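
As a rough illustration of why this form is friendly to C tooling, the sketch below makes the S1/S4 line compile as ordinary C. Everything except the instruction line itself is an assumption made for the example: the enum placeholders, the macro bodies, and the choice to expand `inst(name)` to a constant are hypothetical and are not CPython definitions. It also assumes that the multiplier is a width in instruction-stream units, so each stream name counts as one unit and each op contributes nothing.

```c
/* Hypothetical placeholders (not CPython definitions): each stream name
   counts as one unit, and each op contributes nothing to the total. */
enum { counter = 1, unused = 1, CHECK_OBJECT_TYPE = 0, LOAD_SLOT = 0 };

/* Assumed macro bodies: S1 and S4 scale a stream name by its width. */
#define S1(x) ((x) * 1)
#define S4(x) ((x) * 4)

/* Assumed expansion: turn the definition into a named integer constant. */
#define inst(name) static const int inst_##name

/* The line from the proposal, unchanged. */
inst(LOAD_ATTR_SLOT) = S1(counter) + CHECK_OBJECT_TYPE + LOAD_SLOT + S4(unused);
```

With these placeholders the constant happens to equal the total width of the stream entries (1 + 4), but the only point being made is that a C parser accepts the line as written.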

The definition of LOAD_FAST changes from:

inst LOAD_FAST ( -- value ) {
    value = frame->f_localsplus[oparg];
    Py_INCREF(value);
}

to:

inst(LOAD_FAST)
/* -- value */
{
    value = frame->f_localsplus[oparg];
    Py_INCREF(value);
}

gvanrossum commented 1 year ago

Looks good, although I'd prefer to come up with a solution where the stack effects (both what the instruction takes from the stack and what it leaves behind) are arguments to the macro. Maybe this would work?

inst(LOAD_FAST, (), (value)) {
    value = frame->f_localsplus[oparg];
    Py_INCREF(value);
}

When preceded by e.g.

#define inst(x, y, z) // nothing

VS Code has no problem with that.

markshannon commented 1 year ago

Maybe use a two-argument macro; that way we can keep the stack comment format:

#define inst(x, y) void inst_ ## x(void)

inst (LOAD_FAST, -- value)
{
    value = frame->localsplus[oparg];
    Py_INCREF(value);
}

gvanrossum commented 1 year ago

Sure, that'll work, assuming the DSL lives in a separate file. I'll try to update the definition file.
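
For readers who want to check that the two-argument macro form really is acceptable C, here is a minimal self-contained sketch. The `Obj` and `Frame` types, the file-scope `frame`, `oparg` and `value` variables, and the simplified `Py_INCREF` are hypothetical stand-ins invented for this example; they are not the CPython definitions. Only the `inst` macro and the instruction body come from the thread.

```c
/* Hypothetical stand-ins for CPython internals, for illustration only. */
typedef struct { long refcnt; } Obj;
typedef struct { Obj *localsplus[4]; } Frame;

static Frame *frame;   /* current frame (stand-in) */
static int oparg;      /* instruction argument (stand-in) */
static Obj *value;     /* the stack output named in the stack comment */

/* Simplified reference-count increment, not the real Py_INCREF. */
#define Py_INCREF(op) ((op)->refcnt++)

/* The macro from the thread: the second argument (the stack effect)
   is discarded by the preprocessor, so "-- value" never reaches the
   C parser. */
#define inst(x, y) void inst_ ## x(void)

inst (LOAD_FAST, -- value)
{
    value = frame->localsplus[oparg];
    Py_INCREF(value);
}
```

The definition expands to an ordinary function named `inst_LOAD_FAST`, which is why an editor or compiler can process the body while a separate tool reads the stack effect from the macro arguments.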

gvanrossum commented 1 year ago

For streams, I propose NAME/1, NAME/2, and NAME/4. That is valid C and feels vaguely more intuitive than *4.
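
Whatever tool consumes the definitions has to read the width back out of a `NAME/N` item. The sketch below shows one way that could look in C; `parse_stream` is a hypothetical helper written for this example and is not part of any existing generator, and the 31-character name limit is an arbitrary assumption.

```c
#include <stdio.h>

/* Hypothetical helper: split a "NAME/N" item into its name and width.
   Returns 1 if both parts were found, 0 otherwise (for example, for a
   plain op name that carries no "/N" suffix). */
static int parse_stream(const char *text, char name[32], int *width)
{
    /* Read up to 31 identifier characters, then a literal '/', then
       a decimal width. */
    return sscanf(text, " %31[A-Za-z0-9_]/%d", name, width) == 2;
}
```

For instance, `parse_stream("unused/4", name, &width)` would fill in `unused` and `4`, while `parse_stream("LOAD_SLOT", name, &width)` reports failure because no width is present.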