The interpreter definition DSL currently has these two syntactic forms:
definition:
kind "(" NAME "," stack_effect ")" "{" C-code "}"
|
kind "(" NAME ")" "=" uop ("+" uop)* ";"
However, these two forms are incompatible with each other if we want to pretend that Python/bytecodes.c is a C file: we can't define a single dummy inst() macro that works in both contexts, since one form is followed by a "{" C-code "}" block and the other by "=" ... ";".
I propose to use a different keyword for each syntactic form, in particular:
definition:
"inst" "(" NAME ("," stack_effect)? ")" "{" C-code "}"
|
"op" "(" NAME "," stack_effect ")" "{" C-code "}"
|
"macro" "(" NAME ")" "=" uop ("+" uop)* ";"
|
"super" "(" NAME ")" "=" NAME ("+" NAME)* ";"
Thus, inst and op always have a block of C code; macro and super always combine other instructions or opcodes. (A uop is an op name or a cache effect.)
(An inst without stack effect is a legacy instruction.)
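To make the keyword-per-form idea concrete, here is a minimal Python sketch that classifies a definition by its leading keyword and checks that the right tail follows. It is not the real generator: `classify`, `HEAD` and the sample stack effect `(a, b -- c)` are hypothetical, and the stack effect is treated as opaque text because the grammar above does not spell it out.

```python
import re

BLOCK_KINDS = {"inst", "op"}        # followed by "{" C-code "}"
COMBINE_KINDS = {"macro", "super"}  # followed by "=" ... ";"

# Matches: keyword "(" NAME, then either "," (stack effect follows) or ")".
HEAD = re.compile(r"\s*(inst|op|macro|super)\s*\(\s*([A-Za-z_]\w*)\s*(,|\))")

def classify(definition):
    """Return (kind, name, has_stack_effect) or raise ValueError."""
    m = HEAD.match(definition)
    if m is None:
        raise ValueError("not a definition header")
    kind, name, sep = m.groups()
    pos = m.end()
    has_effect = sep == ","
    if has_effect:
        # Skip the opaque stack effect up to the ")" that closes the header.
        depth = 1
        while depth:
            if pos >= len(definition):
                raise ValueError("unterminated header")
            ch = definition[pos]
            depth += (ch == "(") - (ch == ")")
            pos += 1
    if kind == "op" and not has_effect:
        raise ValueError("op requires a stack effect")
    if kind in COMBINE_KINDS and has_effect:
        raise ValueError(f"{kind} takes no stack effect")
    expected = "{" if kind in BLOCK_KINDS else "="
    if not definition[pos:].lstrip().startswith(expected):
        raise ValueError(f"{kind} must be followed by {expected!r}")
    return kind, name, has_effect
```

With this, `classify("inst(NOP) { }")` is accepted as a legacy instruction, while the old second form spelled with `inst`, such as `inst(X) = A + B;`, is rejected because the keyword alone now determines what may follow.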
The difference between super and macro is in how they are dispatched: a super-instruction has 2 or more opargs and is encoded as 2 or more code units (and moreover, jumping to the second of these will execute the second half of the super-instruction). A macro instruction has a single oparg and takes up a single code unit (not counting inline cache fields). Only macro can also take cache effects as input.
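The dispatch difference can be sketched with a toy interpreter loop in Python. Everything here is illustrative: the instruction names are made up, inline cache fields are ignored, and I am assuming that the second code unit of a super-instruction holds the plain opcode of its second component, which is what makes a jump into the middle work.

```python
# Hypothetical instruction tables; all names are illustrative only.
SUPERS = {"LOAD_A__LOAD_B": ["LOAD_A", "LOAD_B"]}  # super(NAME) = NAME + NAME;
MACROS = {"CHECKED_ADD": ["_GUARD", "_ADD"]}       # macro(NAME) = uop + uop;

def run(code, pc=0):
    """Trace the (component, oparg) pairs executed starting at code unit pc."""
    trace = []
    while pc < len(code):
        opcode, oparg = code[pc]
        if opcode in SUPERS:
            # One code unit (and one oparg) per component.
            parts = SUPERS[opcode]
            for offset, part in enumerate(parts):
                trace.append((part, code[pc + offset][1]))
            pc += len(parts)
        elif opcode in MACROS:
            # A single code unit; every component sees the same oparg.
            trace.extend((part, oparg) for part in MACROS[opcode])
            pc += 1
        else:
            trace.append((opcode, oparg))
            pc += 1
    return trace

code = [
    ("LOAD_A__LOAD_B", 0),  # first unit: super opcode, first oparg
    ("LOAD_B", 1),          # second unit: second component, second oparg
    ("CHECKED_ADD", 7),     # macro: one unit, one oparg
]
```

Starting at `pc=0` runs both halves of the super-instruction; starting at `pc=1` runs only the second half, then the macro as usual.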
I also propose that inst, macro and super always define top-level bytecode instructions. (It is already the case that op always defines a building block.) super can only combine top-level bytecode instructions; macro can only combine ops and cache effects. I doubt we'll need macros as input to other macros.
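These composition rules are easy to check mechanically. Below is a rough Python sketch, assuming definitions have already been parsed into a table mapping NAME to (kind, components); `CacheEffect`, `check` and that table layout are hypothetical. I read a macro's components as uops, i.e. op names or cache effects, per the definition of uop above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CacheEffect:
    """Hypothetical stand-in for a cache effect used as a macro component."""
    name: str

TOP_LEVEL = {"inst", "macro", "super"}  # kinds that define bytecode instructions

def check(defs):
    """defs maps NAME -> (kind, components); returns a list of error strings."""
    errors = []
    for name, (kind, parts) in defs.items():
        for part in parts:
            target_kind = defs.get(part, ("",))[0] if isinstance(part, str) else ""
            if kind == "super":
                ok = target_kind in TOP_LEVEL
            elif kind == "macro":
                # Ops and cache effects only; no macros inside macros.
                ok = isinstance(part, CacheEffect) or target_kind == "op"
            else:
                ok = False  # inst and op carry C code, not components
            if not ok:
                errors.append(f"{name}: {kind} cannot combine {part!r}")
    return errors
```

For example, a macro listing a top-level instruction, or a super listing an op, each yields one error, while a macro built from two ops and a `CacheEffect` passes.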
PS: If and when we switch to a register machine these things will have to change anyway.