Partial functions are implemented by incrementally constructing a callable on the stack, called a thunk. The distinction between a thunk and a plain function is implementation-specific, but in particular, a block is a statically known sequence of instructions, while a thunks can in principle be constructed at runtime or at compile time.
Thunks are first-class values. They can be applied, like ordinary functions, since they are callable. But they can also compose at the instruction level. Thunks are built up incrementally on the stack by applying operations to thunk operands. Only calling a thunk leaves an ordinary (non-thunk) value on the stack.
Syntax
$ - each instance creates a distinct parameter
$..., rest place holder, can appear at most once, in the final argument position.
Extended syntax
These are intuitive, but might be tricky to implement as described below.
$0, $1... etc, positional place holders, each can appear more than once. arity is implied by max() over all indices.
$0..3, slicing placeholder, can appear more than once.
$3.., slicing rest place holder.
Operations on Thunks
Any instruction can consume one or more thunk operands, and the result will yield a thunk.
Informal Examples
The identity thunk is written as thunk([arg(0)], 1, 1, []). This is a thunk which simply returns is argument unchanged when called. The placeholder instruction simply places this thunk on the stack.
Here are some examples to get a feel for how thunks are constructed incrementally. Here $ is used as shorthand for the placeholder instruction.
$ => thunk([arg(0)], 1, 1)
$ plus => stack underflow
$ $ => thunk([arg(0)], 1, 1)thunk([arg(0)], 1, 1)
$ $ plus => thunk([arg(0) arg(1) plus], 2, 1)
$ dup => thunk([arg(0) dup], 1, 2)
$ 1 plus => thunk([arg(0) 1 plus], 1, 1)
1 $ plus => thunk([1 arg(0) plus], 1, 1)
1 $ plus 2 times => thunk([1 arg(0) plus 2 times], 1, 1)
1 $ plus $ times => thunk([1 arg(0) plus arg(1) times], 2, 1)
Special case for partially applied function calls:
In general, other than arg(x, y), an arbitrary instruction X can be lifted to a thunk:
lift(X) => thunk([X], 0, r(x), []), where and r(X) is the return arity (usually 1, but 0 in the case of out, and could be larger in the presence of de-structuring, for example), and [] is an empty capture list.
For arg instructions, this becomes: lift(arg(x, y)) => thunk([arg(1, 0)], 0, 1, [arg(x, y)]), i.e. the argument value is captured, and the resulting thunk takes no arguments and returns the captured value.
Note that the argument arity of the lifted instruction is always 0, which can result in degenerate thunks which cannot be called without causing a stack underflow. This is acceptable because the lifted thunks are never placed on the stack directly, but concatenated as described below. Defining lifting this way is crucial for this definition of concatenation.
The well-formedness of the resulting thunk depends on the well-formedness of the instruction sequence producing the thunk.
Instructions consuming thunk operands
Using the definitions above, we can extend arbitrary instructions to support thunk operands.
Here t<n> represents an arbitrary thunk operand, and v denotes an arbitrary non-thunk value:
unary: t0 X => t0 ++ lift(X)
binary:
t0 v X => t0 ++ lift(v) ++ lift(X)
v t0 X => lift(v) ++ t0 ++ lift(X)
t0 t1 X => t0 ++ t1 ++ lift(X)
Ternary and higher cases omitted here for brevity, but they amount to a straight-forward extension of this pattern. An operand is either a thunk, or a plain value. Plain values and instructions are lifted to thunks, all thunks concatenated, and the result placed on the stack.
General thunk concatenation
The "++" operation on thunks denotes concatenation.
For any two thunks left = thunk(lblock, largc, lretc, lcaptures), and right = thunk(rblock, rargc, retc, rcaptures), their concatenation left ++ right is as defined as:
left ++ right = thunk(lblock ++ rblock.map(adjust_args), largc + rargc, rretc, lcaptures ++ rcaptures), where adjust_args is defined below.
That is to say: instruction blocks are concatenated, adjusting argument indices as needed. The argument arity of the resulting thunk is the sum of the left and right argument arities, while the return arity is simply that of the right thunk (it is assumed that the right thunk's code block consumes any values which the left thunk leaves on the stack). Capture lists are also concatenated.
Argument Adjustment
Argument frame 0 denotes the thunk's formal parameters. Argument frame 1 is used for captured thunk values.
Most instructions on the right hand side concatenate into a thunk unchanged; however, arg instructions must be adjusted to bind the correct argument inside the thunk.
If an instruction of the form arg(0, x) appears in the right-hand thunk, it becomes arg(0, x + largc) in the resulting thunk.
If an instruction of the form arg(1, y) appears in the right-hand thunk, it becomes arg(1, y + lcaptures.len())
arg(x, y), where x > 1 are concatenated without modification.
Special Case: Thunks as Function Arguments
TBD, expand on how this can be optimized by merging thunks eargerly. May be more appropriate for compile-time vs runtime.
Interaction with const-folding
Thunk concatenation can occur at compile-time during const-folding.
Const-folding may happen in the front-end at the AST level, but it should result in the same instruction sequences in generated code. Moreover, const-folding is allowed to apply thunks when all arguments are known, so the result may not appear in the generated code at all in this case.
Partial Evaluation
Partial functions are implemented by incrementally constructing a callable on the stack, called a thunk. The distinction between a thunk and a plain function is implementation-specific, but in particular, a block is a statically known sequence of instructions, while a thunks can in principle be constructed at runtime or at compile time.
Thunks are first-class values. They can be applied, like ordinary functions, since they are callable. But they can also compose at the instruction level. Thunks are built up incrementally on the stack by applying operations to thunk operands. Only calling a thunk leaves an ordinary (non-thunk) value on the stack.
Syntax
$
- each instance creates a distinct parameter$...
, rest place holder, can appear at most once, in the final argument position.Extended syntax
These are intuitive, but might be tricky to implement as described below.
$0
,$1
... etc, positional place holders, each can appear more than once. arity is implied by max() over all indices.$0..3
, slicing placeholder, can appear more than once.$3..
, slicing rest place holder.Operations on Thunks
Any instruction can consume one or more thunk operands, and the result will yield a thunk.
Informal Examples
The identity thunk is written as
thunk([arg(0)], 1, 1, [])
. This is a thunk which simply returns is argument unchanged when called. Theplaceholder
instruction simply places this thunk on the stack.Here are some examples to get a feel for how thunks are constructed incrementally. Here
$
is used as shorthand for theplaceholder
instruction.$
=>thunk([arg(0)], 1, 1)
$ plus
=> stack underflow$ $
=>thunk([arg(0)], 1, 1)
thunk([arg(0)], 1, 1)
$ $ plus
=>thunk([arg(0) arg(1) plus], 2, 1)
$ dup
=>thunk([arg(0) dup], 1, 2)
$ 1 plus
=>thunk([arg(0) 1 plus], 1, 1)
1 $ plus
=>thunk([1 arg(0) plus], 1, 1)
1 $ plus 2 times
=>thunk([1 arg(0) plus 2 times], 1, 1)
1 $ plus $ times
=>thunk([1 arg(0) plus arg(1) times], 2, 1)
Special case for partially applied function calls:
1 $ func([...arg(0, 0) ... arg(0, 1)...], 2, 1) call
=>thunk([... arg(1, 0) ... arg(0, 0) ...], 1, 1, [1])
Formal Description
Thunk Monoid
Thunks form a monoid with
$
and concatenation.Instruction Lifting
In general, other than
arg(x, y)
, an arbitrary instruction X can be lifted to a thunk:lift(X)
=>thunk([X], 0, r(x), [])
, where andr(X)
is the return arity (usually 1, but 0 in the case ofout
, and could be larger in the presence of de-structuring, for example), and[]
is an empty capture list.For arg instructions, this becomes:
lift(arg(x, y))
=>thunk([arg(1, 0)], 0, 1, [arg(x, y)])
, i.e. the argument value is captured, and the resulting thunk takes no arguments and returns the captured value.Note that the argument arity of the lifted instruction is always 0, which can result in degenerate thunks which cannot be called without causing a stack underflow. This is acceptable because the lifted thunks are never placed on the stack directly, but concatenated as described below. Defining lifting this way is crucial for this definition of concatenation.
The well-formedness of the resulting thunk depends on the well-formedness of the instruction sequence producing the thunk.
Instructions consuming thunk operands
Using the definitions above, we can extend arbitrary instructions to support thunk operands.
Here
t<n>
represents an arbitrary thunk operand, andv
denotes an arbitrary non-thunk value:t0 X
=>t0 ++ lift(X)
t0 v X
=>t0 ++ lift(v) ++ lift(X)
v t0 X
=>lift(v) ++ t0 ++ lift(X)
t0 t1 X
=>t0 ++ t1 ++ lift(X)
Ternary and higher cases omitted here for brevity, but they amount to a straight-forward extension of this pattern. An operand is either a thunk, or a plain value. Plain values and instructions are lifted to thunks, all thunks concatenated, and the result placed on the stack.
General thunk concatenation
The "++" operation on thunks denotes concatenation.
For any two thunks
left
=thunk(lblock, largc, lretc, lcaptures)
, andright
=thunk(rblock, rargc, retc, rcaptures)
, their concatenationleft ++ right
is as defined as:left ++ right
=thunk(lblock ++ rblock.map(adjust_args), largc + rargc, rretc, lcaptures ++ rcaptures)
, whereadjust_args
is defined below.That is to say: instruction blocks are concatenated, adjusting argument indices as needed. The argument arity of the resulting thunk is the sum of the left and right argument arities, while the return arity is simply that of the right thunk (it is assumed that the right thunk's code block consumes any values which the left thunk leaves on the stack). Capture lists are also concatenated.
Argument Adjustment
Argument frame 0 denotes the thunk's formal parameters. Argument frame 1 is used for captured thunk values.
Most instructions on the right hand side concatenate into a thunk unchanged; however,
arg
instructions must be adjusted to bind the correct argument inside the thunk.arg(0, x)
appears in the right-hand thunk, it becomesarg(0, x + largc)
in the resulting thunk.arg(1, y)
appears in the right-hand thunk, it becomesarg(1, y + lcaptures.len())
arg(x, y
), wherex
> 1 are concatenated without modification.Special Case: Thunks as Function Arguments
TBD, expand on how this can be optimized by merging thunks eargerly. May be more appropriate for compile-time vs runtime.
Interaction with const-folding
Thunk concatenation can occur at compile-time during const-folding.
Const-folding may happen in the front-end at the AST level, but it should result in the same instruction sequences in generated code. Moreover, const-folding is allowed to apply thunks when all arguments are known, so the result may not appear in the generated code at all in this case.