markshannon opened 1 year ago
Please bear with me as I'm quite confused. What keeps track of the shadow stack (how does the shadow stack suddenly know that `typing` or other things are on the stack)? Or is it just another operand stack?
Hopefully this helps anyone trying to understand this better. After reading Mark's PhD thesis (https://theses.gla.ac.uk/2975/), this is what I surmise:
Ref page 114 (5.7, Deferred Object Creation): What's a shadow frame stack?
"The DOC pass defers creating objects for as long as possible. To do this, it maintains a shadow data stack and a shadow frame stack to record objects that it is currently deferring. The shadow data stack and a shadow frame stack record the difference between the original, non-deferred state and the actual, deferred state. When the DOC pass encounters an instruction that would create a new object which would be of a type that the DOC pass understands, such as tuple, then a deferred object is pushed to the shadow stack. The DOC pass also maintains a shadow line number."
(Shannon, 2011)
What happens at boundaries between shadow stack objects and operand stack objects
"There are a number of instructions that the DOC pass understands but cannot defer. To handle these, the DOC pass is able to mix objects that have already been created with deferred ones. For example, if the DOC pass encounters an i_add instruction it must ensure that the top two values on the stack actually exist, emitting the code to create any deferred objects. It then emits the i_add instruction and pushes a marker to the deferred stack, showing that the object on top of the shadow stack corresponds to the one on top of the real stack."
(Shannon, 2011)
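To make the quoted mechanism concrete, here is a toy model (my own sketch, not CPython's or the thesis's actual code; opcode names are invented) of a DOC-style pass mixing deferred objects with real ones. Creating a constant is deferred onto the shadow stack; an `I_ADD` cannot be deferred, so it forces its two operands to exist, and the `REAL` marker records that shadow entries now line up with real stack values:

```python
REAL = "REAL"  # marker: this shadow entry corresponds to a value on the real stack

def force_top_n(shadow, emitted, n):
    """Emit code so the top n shadow entries exist on the real stack.
    For simplicity this sketch assumes deferred entries sit above real ones."""
    for entry in shadow[-n:]:
        if entry != REAL:
            emitted.append(("LOAD_CONST", entry[1]))  # create the deferred object
    shadow[-n:] = [REAL] * n  # they now line up with real stack values

def doc_pass(ops):
    shadow, emitted = [], []
    for op in ops:
        if op[0] == "LOAD_CONST":
            shadow.append(("const", op[1]))  # defer: nothing emitted yet
        elif op[0] == "I_ADD":
            force_top_n(shadow, emitted, 2)  # i_add needs both operands for real
            emitted.append(("I_ADD",))
            shadow.pop()
            shadow[-1] = REAL  # the result is on the real stack
    return emitted

# Both constants are created only at the point i_add forces them:
print(doc_pass([("LOAD_CONST", 1), ("LOAD_CONST", 2), ("I_ADD",)]))
# [('LOAD_CONST', 1), ('LOAD_CONST', 2), ('I_ADD',)]
```

A constant that is never consumed by an undeferrable instruction is never created at all, which is the whole point of the pass.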
So in the example Mark gives above, it is effectively seeing:

```python
x = typing.cast(int, x)
return unpredictable  # return or some other exit (not sure what EXIT is); unpredictable is a module global
```

and deciding that, after inlining `typing.cast`, the whole intermediate operation is effectively a no-op; the only thing that really matters is the `return unpredictable`. Thus `x = typing.cast(int, x)` becomes a no-op.
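The reason the whole call can be dropped: `typing.cast` does no checking or conversion at runtime, it simply returns its second argument. A quick demonstration:

```python
import typing

# typing.cast performs no runtime check or conversion; it returns its
# second argument unchanged. Once the call itself is inlined, the optimizer
# can therefore treat the whole operation as a no-op.
x = "not actually an int"
y = typing.cast(int, x)
assert y is x  # the exact same object comes back, no conversion happened
```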
In short, partial evaluation, to my understanding, is when the optimizer can determine the net input and output effects, and optimize away any known intermediate operations. Really cool idea!
Everything you could possibly want to know about partial evaluation (and then some): https://www.itu.dk/~sestoft/pebook/pebook.html
In this case we are partially evaluating the function `interp(sb, ctx)`, where `interp` is the interpreter, `sb` is the unoptimized superblock, and `ctx` is the executing context (known types, etc.), to produce a residual program `opt`.
Full partial evaluation would include compilation, such that `opt(ctx) == interp(sb, ctx)`, but we want to produce a program for a tier 2 interpreter `interp2`, such that `interp2(opt) == interp(sb, ctx)`.
So this is a limited form of partial evaluation, but many of the same concepts apply.
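Here is a minimal model of that relationship, using a made-up two-register-free "bytecode" of my own (not CPython's): `specialize` plays the role of the optimizer, partially evaluating with respect to the known parts of the context and leaving a residual superblock for the rest:

```python
def interp(sb, ctx):
    """Reference interpreter for the toy superblock format."""
    stack = []
    for op, arg in sb:
        if op == "LOAD":
            stack.append(ctx[arg])           # read a variable from the context
        elif op == "PUSH_CONST":
            stack.append(arg)
        elif op == "ADD_CONST":
            stack.append(stack.pop() + arg)
    return stack[-1]

def specialize(sb, known):
    """Partial evaluator: fold every operation whose inputs are already
    known at optimization time, leaving a residual program for the rest."""
    out = []
    for op, arg in sb:
        if op == "LOAD" and arg in known:
            out.append(("PUSH_CONST", known[arg]))          # value known: fold the load
        elif op == "ADD_CONST" and out and out[-1][0] == "PUSH_CONST":
            folded = out.pop()
            out.append(("PUSH_CONST", folded[1] + arg))     # fold the addition too
        else:
            out.append((op, arg))
    return out

sb = [("LOAD", "x"), ("ADD_CONST", 1), ("ADD_CONST", 2)]
opt = specialize(sb, known={"x": 10})
assert opt == [("PUSH_CONST", 13)]                  # fully folded
assert interp(opt, {}) == interp(sb, {"x": 10})     # residual program agrees
```

When nothing is known, `specialize` returns the superblock unchanged; the more of `ctx` is known, the smaller the residual program gets.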
I've read enough literature to understand this properly now. Here's a proposed definition/invariant/pseudo-algo:
We shall start with a simple optimization: e.g. LOAD_FAST, POP_TOP or constant propagation will do.
```mermaid
flowchart BT
    A(Bottom, i.e. empty) --> B(S)
    A --> C(D)
    B --> E(S,S)
    B --> D(S, D)
    C --> D
    C --> F(D, D);
```
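Assuming S means "value on the real stack" and D means "deferred" (my reading of the lattice above), the `LOAD_FAST`/`POP_TOP` case might look like this toy pass. The handling of other opcodes is invented for illustration:

```python
REAL = "S"  # shadow entry tracking a value that exists on the real stack

def optimize(ops):
    """Toy shadow-stack pass: LOAD_FAST is deferred; a POP_TOP that meets a
    deferred load cancels it, so neither instruction is emitted."""
    shadow, out = [], []
    for op in ops:
        if op[0] == "LOAD_FAST":
            shadow.append(("D", op[1]))      # deferred: emit nothing yet
        elif op[0] == "POP_TOP":
            if shadow and shadow.pop() != REAL:
                continue                     # the load/pop pair vanishes entirely
            out.append(("POP_TOP",))         # a real value must actually be popped
        else:
            # an instruction we cannot defer forces all deferred loads to exist
            out.extend(("LOAD_FAST", e[1]) for e in shadow if e != REAL)
            shadow = [REAL] * len(shadow)
            out.append(op)
            # (stack effect of the forced op on the shadow stack omitted for brevity)
    return out

assert optimize([("LOAD_FAST", "x"), ("POP_TOP",)]) == []
```

A `LOAD_FAST` followed by anything undeferrable is emitted as-is, so the pass only wins when the value provably dies before being observed.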
Relevant paper https://dl.acm.org/doi/10.1145/1929501.1929508
This pass defers the creation and update of objects, including the frame stack.
By maintaining a "shadow frame stack" we can defer operations that create new objects, or modify the frame stack, only materializing the frame stack when needed.
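A rough sketch of deferring frame creation this way (invented opcode names and a deopt trigger of my own; not CPython's actual pass): a frame push for an inlined call only records a shadow entry, and a real frame is built only if something later needs the true frame stack.

```python
class ShadowFrame:
    """Records enough to build a real frame later, if one is ever needed."""
    def __init__(self, func):
        self.func = func
        self.prev_instr = None   # updated by SAVE_IP, which is trivially deferrable

def run_pass(ops):
    shadow_frames, out = [], []
    for op in ops:
        if op[0] == "PUSH_FRAME":
            shadow_frames.append(ShadowFrame(op[1]))   # deferred: nothing emitted
        elif op[0] == "SAVE_IP":
            if shadow_frames:
                shadow_frames[-1].prev_instr = op[1]   # just record it
        elif op[0] == "POP_FRAME" and shadow_frames:
            shadow_frames.pop()                        # the frame was never materialized
        elif op[0] == "DEOPT":
            # something needs the real frame stack: materialize all of it now
            for f in shadow_frames:
                out.append(("MAKE_FRAME", f.func, f.prev_instr))
            shadow_frames.clear()
            out.append(("DEOPT",))
    return out

# A call that is fully inlined never creates a frame at all:
assert run_pass([("PUSH_FRAME", "f"), ("SAVE_IP", 3), ("POP_FRAME",)]) == []
```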
Doing so for `SAVE_IP` is trivial: we just record the current `prev_instr` on the shadow frame. I will omit the `SAVE_IP`s for clarity in the following example. Taking the code from the specialization example:
This produces the following superblock, with the shadow frame stack shown in comments:
Which after the cleanup pass is:
This is obviously a somewhat contrived example, but you get the idea.