Open masak opened 8 years ago
- The implementation of
next
would then beif loopcondition { ctrlflow.goto(findtag("loop").beginning); } else { ctrlflow.goto(findtag("loop").end); }
I'm subtly and dangerously wrong above.
A next
is not a simple goto
. Both for
and while
loop blocks can accept parameters, and these need to be bound to their new values. The simplest way to do that, of course, is to cleanly exit the current iteration, and to call the block again with the new parameters.
It may seem a bit silly to provide .next
as a method on ctrlflow
, but that might be the easiest way to write the implementation of next
in a sensible way:
ctrlflow.next();
Doing it this way would require slightly more than a loop
tag to be attached to loop blocks; they'd also need a "next
handler".
We could rewrite the runtime code so that it calls into the same handler during normal loop control flow. In the fullness of time, we might even be able to provide the loops themselves as statement macros, using the same API from with 007.
Note that, even if we make the feature totally static and lexical, we will still have interaction between next
et al. and the call stack:
for [1, 2, 3] -> e {
sub foo() {
if e == 2 {
next; # needs to unwind the call stack to the loop block
}
}
foo();
say(e); # 1\n3\n
}
And, since the call stack is dynamic and subs are first-class, this could also happen:
my fn;
for [1, 2, 3] -> e {
fn = sub () {
next; # lexically inside the loop block, but not dynamically
};
}
fn(); # runtime error: attempt to `next` a completed loop
Coming back to this issue and re-reading it, I say we might consider going the Python route and forbidding next
et al. if there's a layer of sub
between the loop and the next
.
Yes, that's a sad cop-out, I know. But (a) since Python does this, it's within 007's rights to choose that route too, and (b) having the conservative version of the feature early is more useful than not having the ambitious version of the feature for a long time, and (c) seeing the feature implemented in a more static way might make us realize things we otherwise would not.
This issue is bounded above in an interesting way by #166 and similar. When we compile to a different backend, we can no longer do very exotic things with the control flow. And since we want to have other backends, 007 also needs to be non-exotic enough so that code can be generated that corresponds to everything the control flow API can do.
When we compile to a different backend, we can no longer do very exotic things with the control flow.
But (as I come back to this later and re-read it) this should be perfectly fine considering what we want to do. It simply means that all the control flow we want to be able to generate needs to be translatable down into conditionals and loops, possibly on gensym'd boolean variables. My intuition tells me this is true of next
, last
, and redo
. It would also be true of something like #13.
Note, however, that it's not obviously true of the next
-and-call-stack problems outlined above. In fact, the dictum that things be directly translatable down to local conditionals and loops may well replace "Python doesn't do it" as the prime reason for 007 to make control flow entirely static.
Been thinking lately about how return
interacts with catch
and #156, and how that needs to be exposed in some sensible way. Essentially, return
doesn't really mean "return immediately" in the cases where some surrounding cleanup mechanism has been installed; rather, it means "hand things over to the nearest cleanup handler".
I have a feeling this could be expressed/designed very neatly in terms of basic blocks. Just need to find the time to sit down and do so.
Quite a bit of overlap here with #252 which I just filed.
When I re-read this issue, I feel what's missing from the OP is that control flow is an aspect of code generation. (That's also why I just closed #252.)
Getting down to brass tacks instead of trying to explain in an abstract manner, here's what I imagine a code generation API might look like:
if
statementscg.jump_unless(condition_expr, "else")
cg.gen(then_body);
cg.label("else");
if else_q ~~ Q::Statement::If {
# recursively generate an if statement here
}
else if else_q ~~ Q::Block {
cg.gen(else_q);
}
while
loopscg.label("next");
cg.jump_unless(condition_expr, "last");
cg.label("redo");
cg.register_loop(WHILE_LOOP, "next", "redo", "last", func() {
cg.gen(loop_body);
});
cg.jump("next");
cg.label("last");
try
statementscg.reroute_all_exits("finally", sub() {
cg.catch_exceptions("catch", func() { # *
cg.gen(try_body);
});
});
cg.gen(catch_blocks);
cg.label("finally");
cg.gen(finally_body);
Marked with an asterisk because the design is not all there. For one thing, the finally
block will need to be informed of whether a return
happened inside of the try
block, so that it can resume that return
(or override it) as appropriate.
with
statements (as in #156)my closable_ref = cg.gen(with_expression);
cg.reroute_all_exits("close", func() {
cg.gen(with_body);
});
cg.label("close");
cg.gen(closing_code(closable_ref));
amb
macro (as in #13)(Would just re-write to a for
loop, which would just be sugar for a while
loop. An assert
would code-gen to a conditional next
with the registered for
loop. The first parameter of register_loop
indicates what type of loop it is. We could introduce a unique symbol AMB_LOOP
, and have the assert
code generation target it.)
"next"
and there'll be no collision.cg
object makes sure that for each local bit of code, all labels are the target of at least one jump, and all jumps have a declared label. It's not exactly Structured Programming, but it's, um, type-safe assembly. No sense in not making life comfortable for ourselves.use cg;
inside their module, and just make cg
calls like we've done above. The compiler could intercept them and code-generate accordingly.Note that even the conservatism suggested in https://github.com/masak/007/issues/145#issuecomment-232278144 is not going to suffice. Why? Because blocks with parameters code-generate as separate "units" of callable code in the general case, because that's the only way closures inside of them can work.
I didn't realize this until I started thinking about bytecode and trying to transcribe our example programs into bytecode.
I'm still mulling on it all, but I think this pushes us towards treating next
et al. less like a goto and more like a control exception.
We're still free to lock it down into being syntactically lexical, but (again, from what I can see so far) there would be no simplification in semantics in the general case from doing so.
Just driving by to point out that Symbol
(#312) would be a really good fit for constructing labels at the bytecode level.
- The above is mainly meant for the innards of a 007 compiler. But as an added bonus, macro authors could
use cg;
inside their module, and just makecg
calls like we've done above. The compiler could intercept them and code-generate accordingly.
I was reminded of this phrase ("The compiler could intercept them") as I read Dan Abramov's React as a UI Runtime, especially the section about Inversion of Control. To wit,
React.createElement
call, which is controlled by React so that React can defer and otherwise strategically call the component when appropriate.It seems to me we want the same sort of layer for cg
calls; that is, just like React tooling enjoys knowing about the component hierarchy, 007 tooling enjoys knowing about the implicit "flow chart" generated by different Qnodes. Just as the React renderer reserves the right to lazily evaluate components, so the 007 code generator sometimes chooses to code-generate some bits but not others.
After several years of waiting for the other shoe to drop, I now think that the control API should be phrased in terms of continuations. Hoping to be able to flesh that out in a concrete way soon, but basically:
if
statement receives its continuation Done
, and then constructs a control flow Start -> Then -> Done
and Start -> Else -> Done
; the code in the Then
and Else
parts can be opaque to the if
statement during code generation.for
loop or a while
loop concerns itself with code-generating the loop-controlling parts, and has an opaque body
that it can code-generate for. The twist, though, is that it sends its End
continuation down into a context surrounding the body
so that a last;
statement inside the body can invoke that continuation. Similarly for next
and redo
.Again, this is but a rough sketch of the idea. Continuations will generally be compiled away into weaker constructs when at all possible. Either we rule out the remaining cases (which was basically the idea with forbidding loop jumps through function boundaries, Python-style), or we accept that continuations do exist sometimes, and basically base our runtime architecture on them. I'm not in a rush to do the latter, but I also think it wouldn't be so bad. Basically Alma could run on a CESK machine.
Maybe I'm not really contradicting 2021-me when I say that it feels like we're essentially talking about algebraic effects here. Maybe the connection is this: if you have continuations and a CESK machine, then algebraic effects can be phrased in terms of having a bit more access to the machine itself, especially the K part.
I would like to be able to provide
next
,last
, andredo
in loops as statement macros. But I would also like the underlying mechanism that they lean upon to be exposed as a clean API.What would the API look like? Something like this:
for
andwhile
loops would register their loopQ::Block
in some way with the tagloop
.ctrlflow.goto(location)
.next
would then beif loopcondition { ctrlflow.goto(findtag("loop").beginning); } else { ctrlflow.goto(findtag("loop").end); }
last
would bectrlflow.goto(findtag("loop").end);
redo
would bectrlflow.goto(findtag("loop").beginning);
I think we don't need to make this mechanism dynamic (like in Perl 6) — in fact, it's easier to statically check if we don't. Which also means that
.map
won't be considered a loop in the above sense. On the other hand, if someone wanted to write a custom loop, they'd simply provide it with the tagloop
, and it'd play along.