Change the interpreter definition DSL to be legal C syntax

markshannon commented 1 year ago

The proposed DSL for defining the interpreter has sound semantics, but the syntax is problematic due to lack of tooling. We want to be able to edit and view the code in standard tools.

The 3.11 (and earlier) instruction definitions are understood by tools that understand C, and we want to retain that.

To do that we need to change the DSL to be a subset of C, as follows:

The definition file will start with macro and function declarations to enable C tooling to understand the following definition. Our tooling will ignore that section.
Definitions will change from kind NAME ... to kind "(" NAME ")" ...
The stack effect will be embedded in a C comment: ( -- value ) becomes /* -- value */
Sequences will need to become C expressions, so CHECK_OBJECT_TYPE LOAD_SLOT becomes CHECK_OBJECT_TYPE + LOAD_SLOT
Stream values will be indicated by multiplication: #index becomes index*1 and ##version becomes version*2.

From the examples,

inst LOAD_ATTR_SLOT = #counter CHECK_OBJECT_TYPE LOAD_SLOT ####unused;

becomes

inst(LOAD_ATTR_SLOT) = counter*1 + CHECK_OBJECT_TYPE + LOAD_SLOT + unused*4;

Alternatively we could define S1, S2 and S4 for values in the instruction stream:

inst(LOAD_ATTR_SLOT) = S1(counter) + CHECK_OBJECT_TYPE + LOAD_SLOT + S4(unused);
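
As a rough illustration of how the sequence form could be accepted by a C compiler, here is a minimal sketch. Everything in the prelude is an assumption made for the example: the enum, the values given to its members, and the expansion of inst(NAME) into a constant definition are hypothetical and not taken from the real definition file. Giving stream names the value 1 and micro-ops the value 0 is only a convenience, so that the expression happens to evaluate to the total width of the stream values.

```c
/* Hypothetical prelude: none of this comes from the actual definition file. */
enum {
    /* Micro-ops contribute nothing to the instruction stream. */
    CHECK_OBJECT_TYPE = 0,
    LOAD_SLOT = 0,
    /* Stream values count as one unit before scaling. */
    counter = 1,
    unused = 1
};

/* Width markers for values read from the instruction stream. */
#define S1(name) ((name) * 1)
#define S2(name) ((name) * 2)
#define S4(name) ((name) * 4)

/* Turns `inst(X) = expr;` into an ordinary constant definition. */
#define inst(NAME) static const int NAME

inst(LOAD_ATTR_SLOT) = S1(counter) + CHECK_OBJECT_TYPE + LOAD_SLOT + S4(unused);
```

With these assumed values LOAD_ATTR_SLOT is 1 + 0 + 0 + 4. Note that this expansion only suits the sequence form; a definition with a { ... } body would need inst to expand to a function header instead.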

The definition of LOAD_FAST changes from:

inst LOAD_FAST ( -- value ) {
    value = frame->f_localsplus[oparg];
    Py_INCREF(value);
}

to:

inst(LOAD_FAST)
/* -- value */
{
    value = frame->f_localsplus[oparg];
    Py_INCREF(value);
}
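
To make the role of the leading declarations section concrete, the following is a sketch of a prelude under which the new LOAD_FAST definition compiles as plain C. The PyObject and Frame stand-ins, the Py_INCREF stub, the file-scope frame, oparg and value variables, and the expansion of inst(NAME) are all assumptions made so the example builds without Python.h; the real prelude may look quite different.

```c
/* Hypothetical stand-ins so the snippet compiles without Python.h. */
typedef struct { long ob_refcnt; } PyObject;
typedef struct { PyObject *f_localsplus[4]; } Frame;
#define Py_INCREF(op) ((op)->ob_refcnt++)

/* Names that instruction bodies refer to implicitly. */
static Frame *frame;
static int oparg;
static PyObject *value;

/* Each definition looks like an ordinary function to C tooling. */
#define inst(NAME) static void inst_##NAME(void)

inst(LOAD_FAST)
/* -- value */
{
    value = frame->f_localsplus[oparg];
    Py_INCREF(value);
}
```

The stack effect stays invisible to the C compiler because it is a comment, so only the DSL tooling would ever read it.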

gvanrossum commented 1 year ago

Looks good, although I'd prefer a solution where the stack effect (both what the instruction takes from the stack and what it leaves behind) is passed as arguments to the macro. Maybe this would work?

inst(LOAD_FAST, (), (value)) {
    value = frame->f_localsplus[oparg];
    Py_INCREF(value);
}

When preceded by e.g.

#define inst(x, y, z) // nothing

VS Code has no problem with that.

markshannon commented 1 year ago

Maybe use a two-argument macro; that way we can keep the stack comment format:

#define inst(x, y) void inst_ ## x(void)

inst (LOAD_FAST, -- value)
{
    value = frame->localsplus[oparg];
    Py_INCREF(value);
}
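
A possible variation on the two-argument macro, sketched here purely as an illustration: instead of discarding the second argument, the macro can stringify it, which shows that `-- value` is an acceptable macro argument and keeps the stack effect visible to C code. The effect_ prefix, the stand-in types, and the file-scope variables are hypothetical names introduced for this example.

```c
/* Hypothetical stand-ins so the snippet compiles without Python.h. */
typedef struct { long ob_refcnt; } PyObject;
typedef struct { PyObject *localsplus[4]; } Frame;
#define Py_INCREF(op) ((op)->ob_refcnt++)

/* Names that instruction bodies refer to implicitly. */
static Frame *frame;
static int oparg;
static PyObject *value;

/* Keeps the stack effect as a string next to the generated function. */
#define inst(x, y) \
    static const char effect_##x[] = #y; \
    static void inst_##x(void)

inst (LOAD_FAST, -- value)
{
    value = frame->localsplus[oparg];
    Py_INCREF(value);
}
```

One limitation to keep in mind: a stack effect that contains a comma would be split into additional macro arguments, so a variadic form such as inst(x, ...) with #__VA_ARGS__ would be needed for that case.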

gvanrossum commented 1 year ago

Sure, that'll work, assuming the DSL lives in a separate file. I'll try to update the definition file.

gvanrossum commented 1 year ago

For streams, I propose NAME/1, NAME/2, and NAME/4. Valid C and feels vaguely more intuitive than *4.
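
On the tooling side, a NAME/N token is easy to take apart. The helper below is a minimal sketch of that step; parse_stream_value is a hypothetical function written for this example and does not correspond to any existing generator code, and it assumes widths are limited to 1, 2 and 4 as proposed above.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Split a token such as "version/2" into its name and width.
 * Returns the width (1, 2 or 4), or 0 if the token is not a stream value.
 * Hypothetical helper; not part of any existing tool.
 */
static int parse_stream_value(const char *tok, char *name, size_t name_size)
{
    size_t n = 0;

    /* Scan the identifier part. */
    while (isalnum((unsigned char)tok[n]) || tok[n] == '_') {
        n++;
    }
    if (n == 0 || n >= name_size || tok[n] != '/') {
        return 0;
    }

    /* Exactly one width digit must follow the slash. */
    char width = tok[n + 1];
    if ((width != '1' && width != '2' && width != '4') || tok[n + 2] != '\0') {
        return 0;
    }

    for (size_t i = 0; i < n; i++) {
        name[i] = tok[i];
    }
    name[n] = '\0';
    return width - '0';
}
```

A plain micro-op name such as LOAD_SLOT has no slash and is reported as 0, so the same pass can tell stream values apart from the other terms of a sequence.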

faster-cpython / ideas

Change the interpreter definition DSL to be legal C syntax #477