j-hui opened this issue 3 years ago
On the subject of implementing output variables, I came up with what I think is a reasonable eager output implementation that doesn't rely on vtables or special output ref types, i.e., the type implementations and codegen remain entirely agnostic to whether a variable happens to affect the external environment.
Here's some pseudocode for what the `tick` function currently looks like:
```
# start tick by performing all delayed assignments for the current time
while event_queue.head().time is now:
    let sv = event_queue.dequeue_head()
    # perform update
    sv.update()  # virtual function
    # wake up listening processes
    for t in sv.triggers:
        act_queue.enqueue(t.act)
# etc..
```
What we want is for the value update to be "felt" by the external environment only if the variable is an output variable, by calling out to some platform-specific output handler. So far, implementation suggestions have focused on pushing that into the `sv`'s `update` function, or deferring the output handler to take place in a continuation (i.e., after `# etc..`).
My suggestion is to add the output handler to a scheduled variable's trigger list, and then invoke that output handler while iterating through the trigger list:
```
# wake up listening processes
for t in sv.triggers:
    case t:
        Act(a)           -> act_queue.enqueue(a)
        OutputHandler(o) -> o()
```
This runs before any continuations during that instant, meaning it is still eager, but doesn't rely on `update` being virtual. Notably, this means we can do the exact same thing during instantaneous assignment, without needing to make `assign` virtual either. In fact, this requires no modifications to the data type implementations. It also doesn't require any extra memory: with some careful bit hacking, we can union the `struct act *` and the output handler function pointer in the same field of `struct trigger`.
The cost we pay is an extra branch to check each trigger, though I think that can be optimized too (we should only need to check for an output handler once per variable). We also still need to figure out what interface we obtain these output references from, since that interface will need to add this special trigger to the trigger list, but that's somewhat orthogonal to this discussion.
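For concreteness, here is a rough C sketch of what that tagged-pointer union might look like. The `struct trigger` layout and all helper names here (`OUTPUT_TAG`, `trigger_dispatch`, `act_queue_enqueue`) are hypothetical, not the actual runtime's:

```c
#include <stdint.h>

struct act;                               /* process activation record */
typedef void (*output_handler_t)(void);

void act_queue_enqueue(struct act *a);    /* assumed runtime interface */

/* A trigger holds either a pointer to an activation record or an
 * output handler, distinguished by the low bit of the pointer. This
 * assumes both are at least 2-byte aligned, so the low bit is free to
 * use as a tag (not strictly portable C, but a common trick on
 * mainstream platforms). */
struct trigger {
    uintptr_t ptr;
    struct trigger *next;
};

#define OUTPUT_TAG ((uintptr_t)1)

static inline void trigger_set_act(struct trigger *t, struct act *a) {
    t->ptr = (uintptr_t)a;                /* low bit clear: activation */
}

static inline void trigger_set_output(struct trigger *t, output_handler_t o) {
    t->ptr = (uintptr_t)o | OUTPUT_TAG;   /* low bit set: output handler */
}

/* The loop over sv.triggers then becomes a tagged dispatch: */
static inline void trigger_dispatch(struct trigger *t) {
    if (t->ptr & OUTPUT_TAG)
        ((output_handler_t)(t->ptr & ~OUTPUT_TAG))();  /* invoke handler */
    else
        act_queue_enqueue((struct act *)t->ptr);       /* wake process */
}
```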
The solution you pose sounds cool, and I think it would be fun to see what such an implementation looks like, but I am thinking about the problem you initially posed.
In the example program, where `m` forks functions `f` and `g`, is that really a compiler issue? I would think that's an application error, i.e., a poorly written program. Surely all other programming languages have the same problem? Or maybe I am missing the point. It seems like we have been too careless with our resources, and now we have two processes that wish to modify the same resource at the same time.
Other "asynchronous" languages usually leave the behavior of data races undefined, or at least nondeterministic, but only because they don't have the total ordering between processes we have (which, to me, is the point of our programming model).
My worry is that, with an eager implementation, an external effect is duplicated, for the same logical time, since the output handler may be invoked multiple times. It just seems inconsistent to say that this behavior is somehow undefined when we give such strong guarantees about the behavior of "races" on variables that aren't output variables.
I realize that we tend to talk about I/O implementation as one topic, but the implementations of input and output turn out to be rather orthogonal. Thus, I'm starting this issue to specifically discuss output variables (and will open another issue to discuss input variables when those issues arise).
We want SSM programs (and their generated code) to be I/O agnostic, but that leads to this question: what should the behavior of the following program be?
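A minimal pseudocode sketch of such a program (assuming `f` assigns `False` and `g` assigns `True` to a shared variable `o`, consistent with the description below):

```
m (o) =
    fork f(o) g(o)   # f is forked at higher priority than g

f (o) =
    o <- False       # instantaneous assignment

g (o) =
    o <- True        # runs after f within the same instant
```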
Within SSM's semantics, `o` should have the value `True` at the end of the instant where `m` is called (assuming nothing else assigns to `o`), since the priority of `g` is lower than that of `f`, due to how the two child processes are forked by `m`.
However, it's somewhat unclear what the behavior should be if `o` is an output variable. If we were to naively overload the assignment operation and eagerly transmit its value to the external environment, the environment would briefly see a glitch value of `False` before seeing the value settle to `True`, even though the semantics of the language say that these updates happen at the same logical time. Worse, if the output variable has stream/continuous semantics (e.g., it represents a socket file descriptor), both `False` and `True` would be emitted (rather than overwriting one another), apparently at the same time.
(Note, though, that this does not happen for delayed assignments, because SSM's semantics specify that there can be at most one scheduled update in the event queue for each scheduled variable.)
To me, the intuitive behavior of only emitting `True` at the end of the instant suggests that we should take a "lazy" approach to output, where intra-instant updates are accumulated in the scheduled variable and only emitted at the end of the instant. But this incurs more latency than necessary, especially in the presence of long-running, lower-priority continuations; that latency is less than desirable for a real-time language.
A more extreme option is to make instantaneous assignments to output variables completely non-effectful, which I am interested in entertaining. After all, what does it even mean to instantly compute and emit output?
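A minimal sketch of what the lazy approach could look like, under the assumption that the scheduler tracks which output variables were written during the instant (all names here are hypothetical):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of lazy output: instantaneous assignment only
 * records the new value and marks the variable dirty; the scheduler
 * flushes each dirty output once at the end of the instant, so the
 * environment only ever observes the settled value. */

struct sv {
    uint32_t value;
    bool dirty;                      /* written this instant? */
    void (*output)(struct sv *);     /* NULL unless an output variable */
    struct sv *next_dirty;           /* intrusive list of dirty outputs */
};

static struct sv *dirty_outputs = NULL;

void assign(struct sv *v, uint32_t val) {
    v->value = val;
    /* ... wake up listening processes as usual ... */
    if (v->output != NULL && !v->dirty) {
        v->dirty = true;
        v->next_dirty = dirty_outputs;
        dirty_outputs = v;
    }
}

/* Called once at the end of tick(), after all continuations have run. */
void flush_outputs(void) {
    for (struct sv *v = dirty_outputs; v != NULL; v = v->next_dirty) {
        v->output(v);                /* emit the settled value, once */
        v->dirty = false;
    }
    dirty_outputs = NULL;
}
```

This guarantees each output handler fires at most once per instant, but only after the lowest-priority continuation has run, which is exactly the latency concern raised above.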