Open kjin opened 6 years ago
/cc @ofrobots
@kjin - thanks for the detailed write-up. Digesting this now. /cc @mrkmarron
OK, so some comments - looking forward to discussing f2f, that will definitely help here.
First off, great visualizations, they absolutely help in driving the conversation here.
Take a look here, which is what I'm going to PR into the diagnostics repo today (Edit: PR 197). It's an attempt at refinement of what we've been talking about, particularly around how to effectively communicate the concepts. Apologies for any waffling on terminology (I'm still playing around to try to land on something that I think resonates). Of particular relevance here is I try to be crisp about defining concepts in terms of "continuations on the stack". I think it addresses some of the questions about how user-space-queuing will look.
Perhaps this is just editorial from me, but I would argue that from the user's point of view, FSREQWRAP is an implementation detail, and doesn't represent the "async model".
I'm confused by your statement that "Note that at the marked line, we are actually in two (nested) execution frames;". We should discuss in more detail. In my mind, fs.readFile(...) executes synchronously, and is off the stack entirely when file 1 is opened.
RE ", I believe that distinctions between ready and linking context are not necessary, because they always correspond to lower-level and higher-level continuations respectively", I need to think this through in more detail, but one observation: your conjecture is only true when both continuations are on the stack. However, if they are off the stack, then "completeness" of the graph requires the "ready/causal context". Perhaps the model can be tweaked to maintain a link to the "parent continuation on the stack", and perhaps from this we can infer the "ready/causal context". i.e., currentContext->causalContext == currentContext->parentContext->linkingContext
Thanks for looking through this!
Perhaps this is just editorial from me, but I would argue that from the user's point of view, FSREQWRAP is an implementation detail, and doesn't represent the "async model".
You're right that it is an implementation detail to users of fs.readFile. However, from a holistic point of view it is impossible to automatically determine what a developer considers host code vs implementation details. Is it safe to draw the line at node_modules, or at the Node API boundary, or at the native-JS boundary? I think drawing the line anywhere higher-level than at the native-JS boundary is making assumptions that might affect people in unexpected ways.
I'm confused by your statement that "Note that at the marked line, we are actually in two (nested) execution frames;". We should discuss in more detail. In my mind, fs.readFile(...) executes synchronously, and is off the stack entirely when file 1 is opened.
fs.readFile seems like an abstraction over a file stream, such as fs.createReadStream. So when the file has been opened, we are in a higher-level continuation corresponding to the continuation point fs.readFile, and a lower-level continuation corresponding to the continuation point fs.createReadStream.
I believe that distinctions between ready and linking context are not necessary, because they always correspond to lower-level and higher-level continuations respectively", I need to think this through in more detail, but one observation: your conjecture is only true when both continuations are on the stack. However, if they are off the stack, then "completeness" of the graph requires the "ready/causal context".
In userspace queueing cases (which I believe are the root cause of divergence between ready and linking context) having only one continuation on the stack gives an incomplete async call graph. The userspace queueing library author would need to manually use AsyncResource (or similar) to fill in the gap to add an extra continuation to the stack. This is probably closely related to how cause/link wrapper functions are supposed to be used.
Quick comment
fs.readFile seems like an abstraction over a file stream, such as fs.createReadStream. So when the file has been opened, we are in a higher-level continuation corresponding to the continuation point fs.readFile, and a lower-level continuation corresponding to the continuation point fs.createReadStream.
We haven't been very crisp about what needs to happen at the Continuation Point boundaries. My take on this is that in general, Continuation Point implementations will "continuify" their arguments, e.g.:
// Wrap a plain function in a Continuation; pass an existing Continuation through unchanged.
function continuify(f) {
  if (f instanceof Continuation) {
    return f;
  } else {
    return new Continuation(f);
  }
}
If above is the case, then would we still see readFile() and createReadStream() showing up simultaneously on the stack?
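To make the question concrete, here is a minimal sketch. The Continuation class, its label field, and the two continuation-point functions are hypothetical stand-ins (none of them are Node APIs). Under those assumptions, the sketch only shows that whether one or two continuations exist depends on whether the higher-level continuation point forwards its already-continuified argument or wraps it in a new internal callback.

```javascript
// Hypothetical Continuation: a function plus a label naming the
// continuation point that created it. Not a real Node API.
class Continuation {
  constructor(fn, label) {
    this.fn = fn;
    this.label = label;
  }
  invoke(...args) {
    return this.fn(...args);
  }
}

// Wrap plain functions; pass existing Continuations through untouched.
function continuify(f, label) {
  return f instanceof Continuation ? f : new Continuation(f, label);
}

// Hypothetical low-level continuation point.
function createReadStreamLike(callback) {
  return continuify(callback, 'createReadStream');
}

// Variant A: the high-level point forwards its continuation unchanged,
// so the low-level point sees a Continuation and creates nothing new.
function readFilePassThrough(callback) {
  return createReadStreamLike(continuify(callback, 'readFile'));
}

// Variant B: the high-level point hands a new internal callback to the
// low-level point, so a second (lower-level) continuation is created.
function readFileInternalCallback(callback) {
  const outer = continuify(callback, 'readFile');
  const inner = createReadStreamLike((...args) => outer.invoke(...args));
  return { outer, inner };
}

const a = readFilePassThrough(() => 'contents');
const b = readFileInternalCallback(() => 'contents');
console.log(a.label);                      // readFile
console.log(b.inner.label, b.outer.label); // createReadStream readFile
```

In variant A only the readFile continuation exists; in variant B both exist and the inner one invokes the outer one, which is the nested case.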
My quick summary of discussion yesterday:
new Continuation(() => {...}, getCurrentContext().readyContext);
Please fill in anything I missed. :)
tl;dr I believe there is no need to distinguish linking and ready context. I'd be happy to talk more F2F, as I don't know how to write this in a succinct way :)
Continuations depend on abstraction layer
At the diagnostics summit in February we talked a little about how the current continuation might be dependent on the "host". For example, under the covers fs.readFile internally reads a file in chunks, but from the surface API we view it as reading a file in one fell swoop. The following visualization shows what the async call graph might look like:

(Each row represents a continuation; green represents when the continuation was passed to a continuation point, and blue sections represent execution frames.)
Note that at the marked line, we are actually in two (nested) execution frames; the higher-level execution frame is nested within the lower-level one. This means that any continuation passed to a continuation point right here will have multiple (2) parent continuations. Depending on our use case, we might be more interested in the high-level fs.readFile parent continuation, or the lower-level FSREQWRAP (grand-)parent continuations.

The parent continuations differ trivially because they are both ultimately traced back to the same initial continuation. If we consider two calls to fs.readFile on behalf of different "requests":

We can see that no matter which parent we follow, we will end up going back to the correct request.
However, if we implement an even higher-level abstraction of fs.readFile that, say, only allows one file to be opened at a time, then we cause context confusion. This is because we need to use userspace queueing to queue up file read requests if one is currently happening, and the place from which a queued function might get invoked might not trace back to its corresponding request. So the async call graph might look like this:

When we hide low-level FSREQWRAP continuations, a problem that arises with the wrapPool function is easy to visualize:

This is a classic example of context confusion: execution after file 2 was read is now being wrongly attributed to request 1. This is because userspace queueing introduces a source of asynchrony that cannot be detected automatically. We need to manually address this source of asynchrony by creating a separate continuation.
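The confusion, and one way of addressing it, can be sketched in a few lines. This is not the wrapPool implementation from the demos: makePool, simulate, the trackContinuations flag and the 'PoolTask' type name are hypothetical, setImmediate stands in for the file read, and AsyncLocalStorage is used only as a convenient way to observe which request the running code is attributed to. AsyncResource and its runInAsyncScope method are the real async_hooks APIs.

```javascript
const { AsyncLocalStorage, AsyncResource } = require('async_hooks');

// Used only to observe which request the current code is attributed to.
const requestContext = new AsyncLocalStorage();

// Hypothetical pool: one task runs at a time, the rest wait in a
// userspace queue. With trackContinuations, each queued task gets its
// own AsyncResource, created where the task was enqueued.
function makePool({ trackContinuations }) {
  const queue = [];
  let busy = false;

  function runNext() {
    if (busy || queue.length === 0) return;
    busy = true;
    const { task, resource } = queue.shift();
    const done = () => { busy = false; runNext(); };
    if (resource) {
      resource.runInAsyncScope(task, null, done); // re-enter enqueue-time context
    } else {
      task(done); // runs in whatever context released the pool
    }
  }

  return function enqueue(task) {
    const resource = trackContinuations ? new AsyncResource('PoolTask') : null;
    queue.push({ task, resource });
    runNext();
  };
}

// Two "requests" each read one file through the pool.
function simulate(pool) {
  return new Promise((resolve) => {
    const seen = {};
    let remaining = 2;
    for (const id of [1, 2]) {
      requestContext.run(`request ${id}`, () => {
        pool((done) => {
          setImmediate(() => { // stands in for the async file read
            seen[`file${id}`] = requestContext.getStore();
            done();
            if (--remaining === 0) resolve(seen);
          });
        });
      });
    }
  });
}

simulate(makePool({ trackContinuations: false })).then((seen) => {
  console.log(seen); // { file1: 'request 1', file2: 'request 1' }
});
simulate(makePool({ trackContinuations: true })).then((seen) => {
  console.log(seen); // { file1: 'request 1', file2: 'request 2' }
});
```

Without the extra resource, the second task is started from the completion callback of the first, so everything after it inherits request 1. Creating the resource at enqueue time is what records the link back to request 2.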
The async_hooks API presents the AsyncResource API to do so (AsyncResource corresponds 1:1 to continuations). Amending the implementation of wrapPool by inserting AsyncResource lifecycle events, we can "fix" the problem and see the following async call graph instead:

The difference from before is that a new, higher-level continuation that accounts for the userspace queueing in wrapPool now allows us to trace back to request 2 from execution after file 2. Therefore, depending on what level we are concerned with, we might consider the marked line file 2 opened to be executing on behalf of request 2 or request 1.

Manually adding continuations in userland Promises
Promises represent the only JavaScript API that is implemented with a task queue. This task queue is not exposed at the JavaScript layer, and is the reason why Promises need to be special-cased.
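The hidden task queue can be observed from plain JavaScript. A small sketch (observeOrder is a hypothetical helper name): a callback attached to an already-resolved Promise still runs only after the synchronous code that follows it.

```javascript
// Records the order in which code runs around a then() call on a
// Promise that is already resolved.
function observeOrder() {
  const order = [];
  const done = Promise.resolve('already resolved').then(() => {
    order.push('then callback');
  });
  order.push('synchronous code after then()');
  return done.then(() => order);
}

observeOrder().then((order) => {
  console.log(order);
  // [ 'synchronous code after then()', 'then callback' ]
});
```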
Put another way, a userspace implementation of Promises requires an underlying task queue, because callbacks passed to then are not executed synchronously, regardless of whether the Promise has already been resolved. The task queue available in a Node environment is the Node event loop, and enqueueing a task can be done with process.nextTick. This diagram shows how the nextTick calls (which result in TickObject continuations) would manifest in a userspace implementation:

This is reminiscent of the wrapPool example shown earlier, as the marked statement is running in multiple continuations, with distinct call lineage tracing back up to request 1 or 2 depending on whether we follow low-level TickObject continuations (which correspond to process.nextTick calls that are Promise implementation details) or the high-level PROMISE continuation that corresponds to the then continuation point.

If we go back to using natively-implemented Promises, there is no reason to remove the continuations associated with calls to nextTick. Therefore, it would make sense for there to be two continuations related to Promises -- a “task queue” continuation (PROMISE-MTQ) and a “then” continuation (PROMISE-THEN):

In summary:
- PROMISE-MTQ is a continuation representing an internal task queue that is created when a Promise is resolved. We enter its execution frame whenever a new task in that queue is run. (The “tasks” are functions that run the callbacks passed to then.)
- PROMISE-THEN is a continuation corresponding to the then continuation point. We enter its execution frame when the callback passed to then is called. PROMISE-THEN is always nested within the PROMISE-MTQ.

We can map these to linking and ready context concepts:
- When the callback passed to then is run, we are in execution frames for two nested continuations: PROMISE-MTQ and PROMISE-THEN.
- PROMISE-MTQ, the lower level continuation, is the ready parent.
- PROMISE-THEN, the higher level continuation, is the linking parent.

To extrapolate from this, I believe that distinctions between ready and linking context are not necessary, because they always correspond to lower-level and higher-level continuations respectively. This principle applies to both Promises and userspace queueing implementations.
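The two nested continuations can be reproduced with a deliberately tiny userspace thenable. This is a sketch under assumptions, not the implementation in the demo repository: MiniPromise, the link option and the attribute helper are hypothetical, and AsyncLocalStorage is used only to observe attribution. process.nextTick plays the role of the task queue (the lower-level, ready side), and an AsyncResource created inside then plays the role of PROMISE-THEN (the higher-level, linking side).

```javascript
const { AsyncLocalStorage, AsyncResource } = require('async_hooks');

const requestContext = new AsyncLocalStorage();

// Hypothetical minimal thenable: no chaining, no rejection.
class MiniPromise {
  constructor() {
    this.settled = false;
    this.value = undefined;
    this.callbacks = [];
  }

  resolve(value) {
    if (this.settled) return;
    this.settled = true;
    this.value = value;
    // Task-queue continuation: created where resolve() is called.
    for (const callback of this.callbacks.splice(0)) {
      process.nextTick(callback, value);
    }
  }

  then(callback, { link = false } = {}) {
    let wrapped = callback;
    if (link) {
      // "then" continuation: created where then() is called.
      const resource = new AsyncResource('PROMISE-THEN');
      wrapped = (value) => resource.runInAsyncScope(callback, null, value);
    }
    if (this.settled) {
      process.nextTick(wrapped, this.value);
    } else {
      this.callbacks.push(wrapped);
    }
  }
}

// then() is called on behalf of request 2, resolve() on behalf of request 1.
function attribute(link) {
  return new Promise((resolve) => {
    const p = new MiniPromise();
    requestContext.run('request 2', () => {
      p.then(() => resolve(requestContext.getStore()), { link });
    });
    requestContext.run('request 1', () => p.resolve('file contents'));
  });
}

attribute(false).then((who) => console.log(who)); // request 1
attribute(true).then((who) => console.log(who));  // request 2
```

Following only the nextTick continuation attributes the callback to the resolver (request 1); adding the higher-level continuation attributes it to the caller of then (request 2). In the linked case the callback runs inside both, with the then continuation nested in the task-queue one.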
Demos
kjin/promise-async-call-graph contains samples (including the userspace Promise implementation). To re-create (roughly) some of the async call graph visualizations here: