De-linearize executions

ELLIOTTCABLE commented 10 years ago

At the moment, a Paws system is very non-linear (that, heh, being kind of the point), basically operating on a graph of operations. However, that graph is encoded as a set of Executions, which in the current design are strictly linear: an Execution is a series, not a graph, of operations. The graph-nature is only exposed in the interactions between Executions at runtime.

I'd like to explore non-linear instruction graphs. That is, Executions that can represent several pending operations simultaneously; when resumed, all nodes depending on the ‘current’ one at which the Execution was paused are now valid for evaluation by the machine.

Tentatively, I'd like to avoid multiple-data-flow through the instruction graph: If you want to re-use some piece of data, some result, multiple times, then I'd rather encourage algorithms that store that reference in the data-graph in a retrievable way (‘just put it in a variable.’) So, basically, I envision Executions in which each node has one-to-many dependency relationships, but only a single data-flow relationship. That is, when node A completes, then the result thereof continues to flow to node B (as in the current Executions), but nodes M and X may also depend on A, without the data-flow relationship, and thus be evaluated concurrently to B.

This can be expressed in a ‘Tires-y’ syntax, as previously discussed elsewhere. As mentioned in #16, I'd like to extend this with SSA to allow encoding of more possible useful dependency graphs.

ELLIOTTCABLE commented 10 years ago

The most obvious problem with this change, from where I'm standing, is that it, well, “breaks Executions.” The famous old move-foward Execution semantics upon which huge swaths of Paws' semantics depend make little or no sense here: an Execution can only ‘move forward’ to next node of a Script, if that Script is strictly linear (thus providing such a “next node” to move to.) With non-linear Scripts, ‘Executions’ make little sense anymore.

This isn't, in and of itself, a problem; out with the old, in with the new, and all that. (As long as the new is better, presumably.) However, breaking Executions has several direct consequences which do bother me:

Coproduction is out the window. Specifically, coproduction as a pattern, is a beautiful hack that both works around, and takes advantage of, the intentional lack of implicit continuation-passing by the Paws machine. However, it can only work because the Execution for the calling code that is passed initially is assumed to move-forward with that calling code, such that it is always pointing at the ‘remainder’ of the caller, that which comes after the current portion of the call in question.

I see a couple ways around this:
- Trashing Executions (in the move-forward sense) entirely, calling them Continuations or similar, and then re-designing Paws to involve traditional implicit passing of the current continuation. That is, making every single staging (resumption) include the caller (resumer) as an implicit argument. (This makes very little sense in the Paws framework; for instance, where does that implicit argument go? Resumption-values are the only value passed around in the Paws system, and there's only one ‘hole’ to be filled on a resumption. Would need further thought.) I dislike this immensely; see 2. below.
- Retain move-forward semantics to the extent that is possible in a generally non-linear environment. Specifically, since I above note that I'd prefer to retain linear data-flow, we could design the Execution type to move-forward along only the current data-flow. This works, but only to a point; and as described in point 2., it reduces expressive power from what is currently available immensely. (Simple example: once you ‘close’ a flow data dependency, i.e. end a line, any Execution held by an arbitrary callee is now invalid. This means there's no way to coproduce arguments across multiple lines.)
In the more general case, I'm very attached to the sorts of abstractions I believe can be built on top of move-forward Executions. I see Paws as a grand symphony of instruction-sets, Executions, passing data back-and-forth following various friendly patterns (of which coproduction/coconsumption are one example) … and this breaks that. I don't yet see how most of those can be implemented using unaddressable (see below), non-linear instruction graphs.

Note: Many of these concerns might be solvable by addressable instructions: #18

devyn commented 10 years ago

I like the idea of using a single-consumption SSA type thing. Perhaps we can still have coproduction if we do something like this? (Excuse the horrible syntax.)

implementation console trace[] Hi;
one: foo[] a b c d;
two(one): implementation console trace[] Yo!;
three: infrastructure affix[] [$two] $one

Summary: statements can have a preamble, which can give them a "name" and/or specify which other named statements they order-depend upon.

[] refers to the hole in that statement - i.e. individual statements are linear. This allows coproduction to occur.

The "name" can be referenced with form $name which refers to the last data-value returned from the statement (i.e. at the end). The name is lexically scoped and its value may only be consumed once (dynamic scoping via locals must be used in other cases). This causes the statement to wait for that data value as soon as it needs to pass it out, but is not the same as adding an order dependency - it will use it as soon as it's available and doesn't cause the entire statement to wait.

If the statement's preamble includes the form (a b c...), then statements a, b, and c must complete before the statement begins to execute.

Essentially, I think, this ends up turning each individual statement into its own Execution, effectively, in a way that's encoded into the Script. The names would be purely for lexical purposes; a compiled Script would probably only have SSA-style numbers or something.

Of course, statements don't need to be named to have an order dependency; one can write:

one: foo[] a b c d;
(one): implementation console print "foo completed"

I'm sure there are ways to make this less syntactically complex, but I think it is still inherently abstractable which is the main thing: someone could modify the parser (dynamically, of course) to always generate order dependencies on the statement above and in doing so create an imperative/synchronous language.

Basically, given

foo: some operation[]

The following will cause data dependency once execution of this statement reaches $foo (but it will be allowed to execute until that point):

other operation[] $foo

And the following won't depend on the data from foo but will wait until that entire statement completes until this statement can start:

(foo): other operation[]

And then this statement does both; it waits for foo to complete and depends on the data from it.

(foo): other operation[] $foo

So, no, these aren't explicitly data related; they're really just "named statements" and the (...) patttern and $... pattern are order dependency and data dependency respectively.

PratchettLovesYou / Issues

De-linearize executions #17