tc39 / proposal-async-context

Async Context for JavaScript
https://tc39.es/proposal-async-context/

The case for _not_ binding around awaits #83

Open Qard opened 2 months ago

Qard commented 2 months ago

I had a call with @andreubotella the other day explaining some of what APM vendors want from context management that we don't have now, explaining the use cases, how we are limited presently, and how some aspects of the proposal at present somewhat limit us too. One of the main issues discussed was that capturing context when an await happens and then restoring at that point cuts off a branch of the graph which we are interested in.

Some examples:

async function readFile(path) {
  // Imagine there's something here to create and store a span for this call scope...

  // We want to produce another span by instrumenting within the openFile function.
  // This span should have a child-of (OpenTelemetry concept) relationship to
  // the readFile span. This is fine as the context will flow into the call.
  const fd = await openFile(path)

  // We want readFd to have a follows-from (OpenTelemetry concept) relationship
  // to the openFile span so we need for that span to be the context value here.
  // This is a problem if we snapshot the context at the await and restore after as
  // the context value would be the readFile span instead of the openFile span.
  const data = await readFd(fd)

  // As with readFd, close should receive the context from the path through readFd,
  // meaning it should have a follows-from relationship to the readFd span.
  await closeFd(fd)

  return data
}

In this scenario we want the execution branches which merge back through an await to also merge back their context. If we instead just relied on context flowing through the promise resolve we would get this automatically and, if no context adjustment was made within openFile, we would still get the same flow expected by binding at awaits.

Another real-world scenario is that we instrument aws-sdk to capture SQS jobs being processed and linking them into a distributed trace. A job processor would look something like this:

const { SQSClient, ReceiveMessageCommand } = require("@aws-sdk/client-sqs")

const client = new SQSClient(clientConfig)

async function processor() {
  while (true) {
    // The job received here has distributed tracing headers passed from another service.
    // We want to create a span when client.send(...) receives the result of the command
    // and finds distributed tracing headers. This span should then be stored in the context
    // so subsequent execution can connect to that span.
    const command = new ReceiveMessageCommand(input)
    const response = await client.send(command)

    // Do some processing...
  }
}

In the above example we want each await client.send(...) to flow the span created inside send out to the async function it's being called from. Subsequent execution within that function, until the next await client.send(...), should be considered a continuation of that context. Because of promises flowing context through the resolve path, if this were to be rewritten with promise.then(...) nesting instead of async/await this would behave as expected, which leads me to believe that binding around awaits is actually incorrect. What we should be doing is just letting the context flow into the functions being called and merge back out of wherever the awaited promises are resolved. Context is meant to follow execution behaviour, so if execution is merging back from elsewhere, so too should context.
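
To make that concrete, here is a rough sketch of the instrumentation we would want to write (activeSpan, originalSend, and spanFromHeaders are hypothetical names): the span is made current on the resolve path, so under the model argued for here it is what the awaiting code would observe.

const activeSpan = new AsyncContext.Variable()

// Hypothetical instrumentation wrapped around client.send: the span is created
// once the response headers are available, and the promise is resolved inside
// activeSpan.run(...), so the resolve path carries the span.
function instrumentedSend(client, command) {
  return new Promise((resolve, reject) => {
    originalSend.call(client, command).then((response) => {
      const span = spanFromHeaders(response)
      activeSpan.run(span, () => resolve(response))
    }, reject)
  })
}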

If we have a way to access the calling context rather than the bound context then this would be a non-issue, but if we're aiming for a single blessed path then this would make the current design of context flow completely unusable for observability products.

As it is presently, context within an async function is effectively no different from local variables. The stack restores in the same way as the context binding over an await does, so they are functionally the same at that point. The whole point of adding context management to the language is that it can flow values through paths which existing code can not.

andreubotella commented 2 months ago

Because of promises flowing context through the resolve path, if this were to be rewritten with promise.then(...) nesting instead of async/await this would behave as expected, which leads me to believe that binding around awaits is actually incorrect.

That is not correct: in the spec text as it currently stands, calling promise.then() takes a snapshot which is stored in the promise reaction job, and which gets restored whenever that job gets invoked after the promise resolves. This is exactly the same behavior as CPED in V8 has currently.
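
For illustration, a minimal sketch of that behaviour: the callback observes the registration-time mapping, not the resolver's.

const ctx = new AsyncContext.Variable();

let resolve;
const promise = new Promise((r) => { resolve = r; });

ctx.run("registration", () => {
  promise.then(() => {
    // The snapshot captured at the .then() call is restored here,
    // so this logs "registration", not "resolver".
    console.log(ctx.get());
  });
});

ctx.run("resolver", () => resolve());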

Qard commented 2 months ago

Ah...well then that is wrong too. πŸ˜…

My understanding from prior conversations here was that we had gone for following the resolve path rather than the promise construction path. We want context to flow from the point a promise is resolved to the point where a continuation from that resolution would run.

Flarna commented 2 months ago

Do you want readFd to be only a follows-from of openFile, or additionally a child-of readFile? Similarly, closeFd could additionally be a child-of readFile.

If both relationships should be kept, it's not about choosing one path - both are needed. And potentially even more, if there are already similar relationships inside readFile because of inner readFile operations.

mhofman commented 2 months ago

This is starting to touch on the problem I tried to convey last year: a promise reaction is mainly a join of 2 paths in the execution graph: the resolving path and the subscribing path. This is actually something most telemetry tools I've seen don't have a way to represent (otel supports links, but they are rarely/never displayed this way by otel tools which represent the execution mostly as a tree).

This is particularly apparent with Promise.all which is a join of all the promises resolving paths with a single subscription path.

One could argue there is a 3rd context: the promise creation time, but that is just the point where the execution graph "forked", and as such is usually recorded by any "call" instrumentation. It is also most of the time the same as the registration context.

Both resolving and subscription contexts are necessary for building distributed debuggers that allow visualizing the execution graph. One such example is Causeway.

Qard commented 2 months ago

So this "third" point you mention is more the difference between the scope of the store.run(...) and the point within that where an async task is scheduled. Any async task scheduled directly within the sync execution which occurs within the function given to store.run(...) should capture the context at the point the scheduling occurs. In the case of a promise, the construction of the promise doesn't schedule anything, it's the resolution of the promise which schedules a microtask.

With context management we basically always want to capture where scheduling happens and resume where continuation happens. In a callback-receiving function this is fairly straightforward: we bind the callback so it returns to the context captured at the original call. It's a bit fuzzier with promises and especially async/await, but we generally want the resolution of a promise to flow the context at that point to the continuation which runs from that resolution. It should flow in that way both with promise.then(...) and also around await promise. The context at promise construction should flow into the code which manages that construction and eventually leads to resolution, but it should not capture at that point, as that would skip over a bunch of internal behaviour we want to be able to flow through.
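
For the callback case, a minimal sketch using the proposal's AsyncContext.Snapshot (runQuery is a hypothetical stand-in for the actual work, and setTimeout stands in for whatever scheduling mechanism is involved):

function queryWithCallback(sql, callback) {
  const snapshot = new AsyncContext.Snapshot(); // capture where scheduling happens
  setTimeout(() => {
    const rows = runQuery(sql);
    snapshot.run(() => callback(null, rows)); // restore where the continuation happens
  }, 0);
}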

If we have a way to escape the bound context back to that scheduling path then it doesn't matter if it follows this path, but we do need the tools to get to that path.

jridgewell commented 2 months ago

I want to build an example from your OP code:

const ctx = new AsyncContext.Variable();
function readFile(path) {
  const span = new Span();
  return ctx.run(span, async () => {
    const fd = await openFile(path);

    // This is the core ask, that awaiting can change the context.
    assert.notEqual(ctx.get(), span);

    // …
    return data;
  });
}

But this is essentially the same as mutating a context in a child call and expecting it to be present in the parent execution:

const ctx = new AsyncContext.Variable();
function readFile(path) {
  const span = new Span();
  return ctx.run(span, async () => {
    const open = openFile(path);

    // Calling `openFile` was allowed to change my context.
    assert.notEqual(ctx.get(), span);

    const fd = await open;

    // …
    return data;
  });
}

I think making this change is a mistake. Rather than allowing openFile to replace the current value stored in ctx, it should mutate the object's properties.

As it is presently, context within an async function is effectively no different from local variables. The stack restores in the same way as the context binding over an await does, so they are functionally the same at that point. The whole point of adding context management to the language is that it can flow values through paths which existing code can not.

This was an important design goal so that we didn't violate dynamic scoping. The same as the discussion we're having in https://github.com/tc39/proposal-using-enforcement/issues/1, I think allowing a child call to replace the context is a mistake.

My understanding from prior conversations here was that we had gone for following the resolve path rather than the promise construction path. We want context to flow from the point a promise is resolved to the point where a continuation from that resolution would run.

reject behaves this way specifically for the execution of unhandledrejection, but it's completely unobservable otherwise. If you promise.then(cb) or promise.catch(cb), the cb will always execute within the context that is active during the promise.then() call.

Under all circumstances, promise.then(()=>console.log(ctx.get())) and await promise; console.log(ctx.get()) must behave the same.

andreubotella commented 2 months ago

But this is essentially the same as mutating a context in a child call and expecting it to be present in the parent execution:

const ctx = new AsyncContext.Variable();
function readFile(path) {
  const span = new Span();
  return ctx.run(span, async () => {
    const open = openFile(path);

    // Calling `openFile` was allowed to change my context.
    assert.notEqual(ctx.get(), span);

    const fd = await open;

    // …
    return data;
  });
}

My understanding of @Qard's proposal is that ctx.get() would not change before the await – instead await open would cause the mapping after the await to be set to the mapping in which open was resolved. The same would go for open.then(() => ctx.get()).

This would not be the function's context changing arbitrarily, it would only happen at await points in async functions. In a sense this would be an equivalent of not restoring the context across yield points in generators.
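
A minimal sketch of the difference, assuming open is resolved inside a different context:

const ctx = new AsyncContext.Variable();

let resolveOpen;
const open = new Promise((resolve) => { resolveOpen = resolve; });

ctx.run("caller", async () => {
  await open;
  // Current spec text: the snapshot taken around the await is restored, so
  // this logs "caller". Under the model described above, the mapping in which
  // `open` was resolved would flow out of the await, so it would log "resolver".
  console.log(ctx.get());
});

ctx.run("resolver", () => resolveOpen());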

Jamesernator commented 2 months ago

But this is essentially the same as mutating a context in a child call and expecting it to be present in the parent execution:

I think making this change is a mistake.

It's not just a mistake, it completely breaks the feature for anything other than tracing. For scheduling, outer tasks would suddenly start gaining the priority of tasks that were scheduled (even arbitrarily deeply) inside other tasks; for dependency injection, inner injections would bleed out into the outer context.

In order to repair this, you would have to wrap every await with a thing that restores the intended context, i.e. for scheduling you'd have to do:

await scheduler.runSubTask(...);

on every await everywhere. This is exactly the thing AsyncContext was meant to avoid.


I don't see any solution for the tracing use case other than just providing AsyncContext.callingContext(), although as @mhofman points out, a single calling context doesn't make sense with Promise.all; probably just returning an array of contexts is enough, i.e. AsyncContext.callingContexts().

Something I am wary of, though, is the fact that any form of AsyncContext.callingContexts() or similar significantly extends the lifetime of any snapshotted variables, even when those variables may need to be cleared as early as possible.

Like for termination tracking, termination can't happen until the parent context ends as well:

const task = await scheduler.measureTask(theTask, (totalTime) => {
    console.log(`Took: `, totalTime);
});

// The following code is *still* within the measurement lifetime because
// AsyncContext.callingContexts() could possibly be called
for (let i = 0; i < 100000000; i+=1) {
   doHeavyComputation();
}

Jamesernator commented 2 months ago

Perhaps there should be a second kind of variable (or an option to AsyncContext.Variable) specifically designed for tracing-like use cases (regular variables would not hold onto their calling context), i.e.:

const tracingVariable = new AsyncContext.TracingVariable();

tracingVariable.run("call", async function() {
    console.log(tracingVariable.get()); // "call"
    console.log(tracingVariable.callingContexts()); // []
    await Promise.all([
        new Promise((resolve) => {
            tracingVariable.run("promise-1", () => {
                resolve();
            });
        }),
        new Promise((resolve) => {
            tracingVariable.run("promise-2", () => {
                resolve();
            });
        }),
    ]);
    console.log(tracingVariable.get()); // "call"
    console.log(tracingVariable.callingContexts()); // ["promise-1", "promise-2"]
});

Qard commented 2 months ago

My understanding of @Qard's proposal is that ctx.get() would not change before the await – instead await open would cause the mapping after the await to be set to the mapping in which open was resolved. The same would go for open.then(() => ctx.get()).

Yes, this is what I mean. Context should definitely not leak out of a context scope, but it should flow through a continuation path, which means that if a promise resolves with a different context then that different context should be what comes out on the other side of the await. If no context change happened before it reached the resolve, then it would use the same context as before without needing to bind around the await; it would just gain the additional capability of flowing context values through continuation paths.

Context is meant to model execution flow, so when execution flow is merging back then the context flow should merge back too.

Qard commented 2 months ago

In order to repair this, you would have to wrap every await with a thing that restores the intended context, i.e. for scheduling you'd have to do:

await scheduler.runSubTask(...);

This is exactly what should be happening. If you want to orphan a sub-graph by binding around it then it probably should be explicit.

I think generally speaking though that context is meant to flow to logically continuing execution. In callback-based code a call which schedules an async task would capture a snapshot and restore that snapshot around the execution of the callback. The exact same behaviour should be expected of a promise, both with p.then(...) chaining and with async/await.

It's important that the innermost path be followed and that parts of the graph do not get stepped over when propagating the context. It's acceptable for context values to flow to a point and then have a bind swap that value out, but the context should flow all the way to that point. Propagation should happen primarily around the points where tasks are scheduled, not where continuations like callbacks are received: the receipt point often has further things inside it which need context for tracing, and there can be multiple points at which tasks get "scheduled" for something which is a single operation from the caller's perspective, so each scheduling should capture the context at its own appropriate point to be correct.

Jamesernator commented 2 months ago

This is exactly what should be happening.

No, this is completely unusable for all the non-tracing use cases, the whole point of AsyncContext is that users don't want to thread things through many layers of functions. If you have to wrap anyway there is little value in this proposal for any of the other use cases.

It's important that the innermost path be followed and that parts of the graph do not get stepped over when propagating the context

No, it's only important for the tracing use case, for all other use cases this is an anti-feature.

Jamesernator commented 2 months ago

Just to give an example of why the proposed behaviour is completely unusable, consider for example the dependency injection use case:

// In practice the logger would be a wrapper around this, but this is fine for the example
const logger = new AsyncContext.Variable({ defaultValue: consoleLogger });

async function subTask() {
    // At the end of the subtask
    return await logger.run(fileLogger("subTask.log"), async function() {
        logger.get().print("In the subtask");
    });
}

await logger.run(fileLogger("main.log"), async function() {
   logger.get().print("Starting the subtask");
   await subTask();
   logger.get().print("Ending the subtask");
});

In this example, your proposed behaviour would have "Ending the subtask" be printed to the subTask.log file, that is absolutely not the behaviour one would want for dependency injection use cases.

Qard commented 2 months ago

the whole point of AsyncContext is that users don't want to thread things through many layers of functions

Yes, I am in agreement with this, and I do think that the ideal state is that context has a most reasonable default flow which avoids needing these manual binds. But there is no path which is universal, even excluding the tracing cases. We need to have mechanisms of getting the context values from other execution paths. A default can and should be chosen in the general case, but we need flexibility or the feature will be useless even for the basic application developer use case as there will be many cases where it is needed to provide some manual signals about path preference.

No, it's only important for the tracing use case, for all other use cases this is an anti-feature.

This is absolutely false. Flowing through the inner paths is required for correct functioning; otherwise users will most certainly try to access context in the internal flows of complex interactions they are creating and have it not work the way they expected, because we didn't flow the context through all execution. For example, say you have a database client connection object and make a query call which takes a callback, but as part of the connection you have some session management step that can be defined as another function, or perhaps an event. If you don't flow through the internal path to where that session management step is executed, and instead step over from call to callback, the user could try to read the context there expecting a value to be present (because it logically flows from the call) and be frustrated and confused when they instead get an empty context.

Context cannot function correctly unless the innermost path is followed. We (APMs) have done extensive research to prove that this is the case, and I have a whole bunch of documentation on the subject which I will share once I've had the time to clean it up.

In this example, your proposed behaviour would have "Ending the subtask" be printed to the subTask.log file, that is absolutely not the behaviour one would want for dependency injection use cases.

That actually usually has been the expected behaviour with AsyncLocalStorage in Node.js. It currently doesn't function this way (due to limitations of the implementation) and we have received many bug reports over the years from people expecting context to flow to logically continuing code. Certainly in this specific example it can be a bit more ambiguous which flow is correct, and this is why I keep pushing that we need control of context flow. I will repeat this as many times as is necessary for people to finally acknowledge it: there is no single perfect context flow; we need per-store context flow capabilities, in the most user-friendly way possible.

Jamesernator commented 2 months ago

A default can and should be chosen in the general case, but we need flexibility or the feature will be useless even for the basic application developer use case as there will be many cases where it is needed to provide some manual signals about path preference

I agree that there should be a way to get other contexts, but I completely believe that the current design is the only coherent behaviour for await.

There is no way to keep the registration, i.e. the await context, without boilerplate on every single await. However the calling context can be accomplished merely by providing a function AsyncContext.callingContexts().

(Users are typically not going to be the ones calling these functions, so the fact that AsyncContext.callingContexts() adds boilerplate isn't really that important; it'll be hidden behind methods like .trace() or such).
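
For instance, a hypothetical tracer-side helper could look something like this sketch (assuming callingContexts() returned an array of Snapshot-like objects; neither API is part of the current proposal):

const activeSpan = new AsyncContext.Variable();

// Hypothetical: collect follows-from links from every calling context, so end
// users only ever call the tracer API and never callingContexts() directly.
function startSpanWithLinks(name) {
    const span = { name, links: [] };
    for (const snapshot of AsyncContext.callingContexts()) {
        const prior = snapshot.run(() => activeSpan.get());
        if (prior) span.links.push({ followsFrom: prior });
    }
    return span;
}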

That actually usually has been the expected behaviour with AsyncLocalStorage in Node.js. It currently doesn't function this way (due to limitations of the implementation) and we have received many bug reports over the years from people expecting context to flow to logically continuing code.

Developers often want magic that isn't coherent beyond a narrow particular usage. In particular we return to the fact that there is arbitrarily many possible contexts when Promise.all is used.

Also from looking at the issues on AsyncLocalStorage, many of them seem to be about .enterWith, which is unsurprising that people would expect this flow upwards as there is no corresponding .exit.
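
For reference, a minimal sketch of the Node.js AsyncLocalStorage.enterWith behaviour that produces that expectation:

const { AsyncLocalStorage } = require("node:async_hooks");

const als = new AsyncLocalStorage();

function helper() {
    // enterWith swaps the store for the remainder of the current synchronous
    // execution and anything scheduled from it; there is no matching exit.
    als.enterWith("set inside helper");
}

helper();
// The value set inside the callee is still visible in the caller, so users of
// enterWith come to expect context to flow "upwards".
console.log(als.getStore()); // "set inside helper"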

For example, say you have a database client connection object and make a query call which takes a callback, but as part of the connection you have some session management step that can be defined as another function, or perhaps an event. If you don't flow through the internal path to where that session management step is executed, and instead step over from call to callback, the user could try to read the context there expecting a value to be present (because it logically flows from the call) and be frustrated and confused when they instead get an empty context.

This is yet another case that needs to be made coherent under the existence of many contexts (i.e. Promise.all again). If the database and connection are cooperating then they can look through all relevant contexts with AsyncContext.callingContexts() (or similar) and the current context and pick whichever one is most relevant.

Qard commented 2 months ago

There is no way to keep the registration, i.e. the await context, without boilerplate on every single await. However the calling context can be accomplished merely by providing a function AsyncContext.callingContexts().

Yes, so I'm saying that as long as that capability exists then our needs are satisfied. However, I'm not seeing how exactly one could do that in a way which could result in having a different context after the await, even with a wrapper. If the await expression ignores whatever context it receives from the promise whose result it yields, then how would wrapping that result be any sort of solution? At the least, I think we may need to specify some additional machinery to ensure the promise continuation context can actually be made available somehow, because as it is now I don't see how it would be. πŸ€”

Developers often want magic that isn't coherent beyond a narrow particular usage. In particular we return to the fact that there is arbitrarily many possible contexts when Promise.all is used.

Yes, and that is something we want to be able to flow through. The final task should be the default value, but we should have the tools to get the other paths to express a trace graph where execution branched and merged.

Also from looking at the issues on AsyncLocalStorage, many of them seem to be about .enterWith, which is unsurprising that people would expect this flow upwards as there is no corresponding .exit.

Another enterWith is the corresponding exit, same as how the AsyncContext spec uses AsyncContextSwap.

This is yet another case that needs to be made coherent under the existence of many contexts (i.e. Promise.all again). If the database and connection are cooperating then they can look through all relevant contexts with AsyncContext.callingContexts() (or similar) and the current context and pick whichever one is most relevant.

Yes, I agree with this. This is another case where tools to get the other contexts are necessary. As I said earlier, I don't think the default behaviour necessarily needs to change so long as we have a way to reach the other relevant context paths. APMs are more than happy to do some additional work to make sure we're getting the right contexts at the right time, and I'm working on a Universal Diagnostics Channel proposal to manage this connection with context management. I'll be working on the docs for that next week so hopefully can share that by the end of the month.

andreubotella commented 2 months ago

Do APMs need to be able to access the contexts for all of the promises resolved with Promise.all? If so, would this be special-cased to Promise.all, or are there userland APIs that logically join multiple contexts that would also need this capability? What about multiple originating sources for a web API event, like the <video> case I mention in #82?

Edit: Also, how would multiple calling contexts flow through the data flow graph?

jridgewell commented 2 months ago

My understanding of @Qard's proposal is that ctx.get() would not change before the await – instead await open would cause the mapping after the await to be set to the mapping in which open was resolved. The same would go for open.then(() => ctx.get()).

My point is that await new Promise(r => ctx.run(2, r)) is effectively ctx.set(2), just with an async tick. The leak still exists from the thing that resolves the promise, and it infects everyone who awaits the promise returned by my async function. It doesn't affect my sync call stack the way ctx.set(2) does, so I guess it's half a win.

Yes, so I'm saying that as long as that capability exists then our needs are satisfied. However, I'm not seeing how exactly one could do that in a way which could result in having a different context after the await, even with a wrapper. If the await expression ignores whatever context it receives from the promise whose result it yields, then how would wrapping that result be any sort of solution? At the least, I think we may need to specify some additional machinery to ensure the promise continuation context can actually be made available somehow, because as it is now I don't see how it would be. πŸ€”

Why isn't it possible to cooperatively mutate the shared object in your examples?

I think the way that you're approaching this is backwards. Learning context variables as if they are function params is exactly the right mental model so that anyone could understand them. Once you enter the function body, the param references and context are set. You can mutate a shared object, but you can't replace the object value itself.

Qard commented 2 months ago

Do APMs need to be able to access the contexts for all of the promises resolved with Promise.all?

Honestly, I personally view multi-context merges as a bit of a stretch goal. But implementing OpenTelemetry correctly, according to its own spec, does indeed require all contexts from a multi-branch merge, as it is supposed to include follows-from links to each.

Why isn't it possible to cooperatively mutate the shared object in your examples?

I'm not sure I understand what you are asking. What do you mean by "cooperatively mutate"?

I think the way that you're approaching this is backwards. Learning context variables as if they are function params is exactly the right mental model so that anyone could understand them. Once you enter the function body, the param references and context are set. You can mutate a shared object, but you can't replace the object value itself.

Trying to map context to existing language constructs is a mistake. It doesn't fit that model. If it did, we wouldn't need another model.

Context is meant to follow the intent of execution, and there are many cases where the "flow" of intended execution is pull-based, such as my aws-sdk example. In cases such as these the context needs to flow back along merge paths just as the data flows. From the user's perspective, not having the context defined within the pulled task flow into the continuation would be a failure of context propagation.

Now perhaps the specific expression of how I have described this problem so far has been insufficient or imperfect from a usability perspective. But I'm trying to make clear that this form of data flow is a very real user expectation that by not supporting would be a significant source of confusion to many users.

To put it as plainly as I can: context is a reflection of the flow of data within an application.

Jamesernator commented 2 months ago

Honestly, I personally view multi-context merges as a bit of a stretch goal.

This is a core language feature, the design likely can't be changed later so getting it right the first time is a must.

Trying to map context to existing language constructs is a mistake. It doesn't fit that model. If it did, we wouldn't need another model.

The exact same reasoning can be applied to the alternate proposal here: having values flow out of inner calls is exactly the same model as if the functions had just returned the associated values. The example in the explainer could just be:

const { headers } = await client.send(command);
// use headers somewhere

Neither approach unlocks anything fundamental that can't be done today.

jridgewell commented 2 months ago

I'm not sure I understand what you are asking. What do you mean by "cooperatively mutate"?

Have the functions mutate the context object. I.e., there's a follows-from and child-of tree relationship for readFile, so why isn't it doing activeSpan.run(readFileSpan, () => { openFile(); …; readFd(); …; closeFd() }) and each of the child functions pushing their own span onto the readFileSpan object?
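
Something like this sketch, reusing the function names from the OP (the span shapes are just placeholders):

const activeSpan = new AsyncContext.Variable();

// readFile puts one shared span object into the context; the child operations
// push their own spans onto it instead of replacing the context value.
async function readFile(path) {
  const readFileSpan = { name: "readFile", children: [] };
  return activeSpan.run(readFileSpan, async () => {
    const fd = await openFile(path);
    const data = await readFd(fd);
    await closeFd(fd);
    return data;
  });
}

async function openFile(path) {
  const span = { name: "openFile" };
  activeSpan.get().children.push(span); // cooperatively mutate the shared object
  // ... actual work ...
}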

Trying to map context to existing language constructs is a mistake. It doesn't fit that model. If it did, we wouldn't need another model… But I'm trying to make clear that this form of data flow is a very real user expectation that by not supporting would be a significant source of confusion to many users.

I completely disagree with this. AsyncContext is a reflection of the lexical call scoping, the ability to implicitly place data on the call stack. If you want to reason about what data is present in a variable, look at the call stack above you to see who set the data last. It's the way I reason about it, it's the way I've explained it to the committee, and it's the way the ecosystem has implemented it.

I think breaking from this would be the source of confusion. If you want additional data out of a promise, then the promise needs to return that data.

imperfect from a usability perspective.

It breaks each of the README examples, and anything that would make use of nested scoping.

Qard commented 2 months ago

Have the functions mutate the context object. I.e., there's a follows-from and child-of tree relationship for readFile, so why isn't it doing activeSpan.run(readFileSpan, () => { openFile(); …; readFd(); …; closeFd() }) and each of the child functions pushing their own span onto the readFileSpan object?

That's how we create memory leaks. That is exactly what almost every APM does now and every one of them suffers severely from memory problems as a result. They're stuffing all their data into span objects and then stuffing each into one continuous trace object which they hold for the entire life of the request rather than storing only the specific bits of data needed for the given path. We shouldn't even be storing spans, we should be storing correlation IDs to minimize risk of things being held alive unexpectedly.

If everything is getting dumped into a readFileSpan object, how do we know when it's not needed anymore and can be cleaned up? What if parts are needed for very different lengths of time? In this specific example it's relatively straightforward, but in cases like streams, timers, or any other thing which can be called multiple times it gets extremely unclear.

Context is a thing we have always been able to do in userland code, it was just terrible because of unreliable flow and cleanup. This is why we pushed for including it in the Node.js runtime, and now the language, so the VM can actually have full control over data lifetimes and expire things at appropriate times. But if we're not going to take advantage of that then I feel like we're missing the point of putting it in the language.

What we want from context is to be able to progressively define points in the execution graph alongside the execution and flow through those points as execution flows. We specifically do not want to be holding that whole graph in-memory. The point of context is that the runtime manages what flows where and until when.

I completely disagree with this.

You can disagree all you want, but you're ignoring the feedback of the person maintaining the single most used context management system (AsyncLocalStorage) currently existing in JS.

The existing tools sort of solve some of the needs of context management, but they are all incomplete and frequently have context breakage issues because they simply lack a fully-formed model. These existing systems have all needed to exist in this way because the more advanced needs are simply not possible without runtime assistance or tremendous complexity and overhead to achieve externally.

I have provided what seems to me like a very clear case where the user expectation would be that context does flow through the promise and out of the await. Now I'm not saying it should always work in that way, as it seems like you have some use case in mind for it to work the opposite way (which I have not seen explained so far), and I do in fact have some use for that myself, but I am saying we need to have some way to access context from that internal path for the API to be useful in many very common cases.

Without a way to get internal paths like this we are standardizing broken context management. The fact that what we have now is broken should not be taken as validation of the status quo.

You're talking to someone whose core focus of their entire career for nearly a decade has been figuring out how to make context management work. I was hired by Datadog in the first place to fix performance and reliability issues with prior context management technologies and to define what those could look like in the future, and before that it was a big part of my work at Elastic and AppNeta. I had my part in building the core parts of what became AsyncLocalStorage, but we had limited ability to make it work the way it actually needs to work, partly because we had not yet learned many of these behaviours I am talking about now, and partly because it was just too difficult to do outside of the runtime.

I'm trying to engage here to ensure that what we land on with AsyncContext is a good API and not just pushing through something half-baked, but I'm feeling that my expertise in this space is being ignored and my examples dismissed as edge cases when I've seen quite extensively that these scenarios occur all the time in real-world code. I'll be the first to question whether edge case behaviours are worth additional complexity, but these are not edge cases at all; these are quite common patterns which we would be breaking.

andreubotella commented 2 months ago

One question I have is, if this is such a problem for the tracing use case, why wasn't it caught earlier?

The behavior across async continuations in the proposed spec text is the same as for ALS and CPED, and my understanding (and apparently the understanding of everyone in yesterday's AsyncContext meeting) was that everyone was clear on that. The examples in the readme also show that this was the intended behavior. And as far as I'm aware, before this issue, no one had raised the model you're proposing.

Yet you seemed to assume that your model was the way AsyncContext would work, and not only that, but you framed the OP as if the proposal had that model for promise.then but it was getting it wrong for await – as if this was an obvious bug we hadn't caught, rather than a completely different model than the proposal had. So how did this misunderstanding come about?

Qard commented 2 months ago

It was my understanding that we were following the resolve path. That was an error on my part. There was a bunch of discussion previously, particularly from James Snell, that suggested the resolve path was the more correct path and it seemed to me at least at the time that people were in agreement on that.

I have only had the time to fairly passively follow the progress of the proposal until recently, so I definitely missed a lot of details and conversations. I have not been invited to any of the meetings, so I've definitely missed all those conversations, and I only joined the Matrix chat somewhat recently as I was unaware there was a channel there for this.

I have unfortunately not had the time to invest too deeply in the AsyncContext discussion until recently, which is why I had not brought this up before. I was under the impression it was still very early in development, yet recently I've had several people reach out telling me it's pushing rapidly toward Stage 3 and asking me to provide input, so I started to dig in to make sure the model made sense before it got too far along to propose changes.

And no, I was not assuming it worked the way I'm describing, I didn't yet know all the details of how it worked. I knew of the general object model, but had not taken the time yet to look really closely at the details of how the flow works.

As for the multi-path thing, that's something I've only recently been trying to design solutions for (I have a bunch of docs to share on that soon), so I wasn't yet sure if it was something that could be solved reasonably in only the single-path way. My original intuition was that it could be, but only if we followed the internal path by default and allowed binding around it. But in more recent iteration on these ideas in my own experimentation, I'm seeing that we can have a default path that follows user behaviour as much as possible, but we need some additional tools to achieve that, like a way to escape a bind back to the old path (calling context), a way to flow through awaits (not yet sure what that should look like), and probably other cases.

What I'm trying to do here is just to shine a light on some areas which I don't feel are sufficiently solved yet so we don't push to the next stage too quickly without consideration for how to handle those cases.

If AsyncContext was to land in a state that it is unable to satisfy the majority of use cases then different systems will surely continue to exist and in particular Observability products may have to continue doing things their own way, which I feel would be a major failure as that will continue to leave a major performance impact on the table without any resolution.

jridgewell commented 2 months ago

That's how we create memory leaks. That is exactly what almost every APM does now and every one of them suffers severely from memory problems as a result. They're stuffing all their data into span objects and then stuffing each into one continuous trace object which they hold for the entire life of the request rather than storing only the specific bits of data needed for the given path.

Can you explain the memory leak here? As far as I understand, both my proposed mutations and the resolve-time continuations would keep the parent chain alive. Especially if you're building a follows-from and child-of relationship, wouldn't the resolve path keep those spans alive?

We shouldn't even be storing spans, we should be storing correlation IDs to minimize risk of things being held alive unexpectedly.

That seems like it could apply with either decision. I assume we'd still need a parent ID so that we can build the relationship after the current execution is completed?

In which case, my mutations suggestion would be to store a new active ID in readFile, and each of the child functions would build their own ID derived from that active ID.

This is why we pushed for including it in the Node.js runtime, and now the language, so the VM can actually have full control over data lifetimes and expire things at appropriate times. But if we're not going to take advantage of that then I feel like we're missing the point of putting it in the language.

My reason for proposing this in the language was to use ALS in a web environment, with considerably better performance. I feel like the proposal is successful at that, and so I'm hesitant to break with precedent.

You can disagree all you want, but you're ignoring the feedback of the person maintaining the single most used context management system (AsyncLocalStorage) currently existing in JS.

It's not my intention to ignore the problem, and I've proposed how I would solve the OP with the current semantics. I'm interested in figuring out why that approach doesn't work. I need more examples to understand why the memory leak exists in one situation and not the other.

My disagreement here is specifically the "by not supporting would be a significant source of confusion to many users" statement. The way that I reason about AC (and everyone in the meeting yesterday), and the way I've explained it to the committee is with the current semantics. Specifically, building from a faulty SyncContext (not a typo) example (slides 2-14). The fact that so many of us have this understanding, and that it matches CPED and ALS's behaviors, reinforces that this is the expected behavior. If we break from it, that would be a source of confusion. We would need a very good reason to do it.

Now I'm not saying it should always work in that way, as it seems like you have some use case in mind for it to work the opposite way (which I have not seen explained so far), and I do in fact have some use for that myself, but I am saying we need to have some way to access context from that internal path for the API to be useful in many very common cases.

From the meeting yesterday, I think we're interested in exploring this. You, @mhofman, and Shay (I don't know their GH handle) have raised the need to receive a promise's resolving context in some scenarios. This seems analogous to callingContext(), and if we can find an API that solves this use case then we should pursue that.

You're talking to someone whose core focus of their entire career for nearly a decade has been figuring out how to make context management work… but I'm feeling that my expertise in this space is being ignored and my examples dismissed as edge cases when I've seen quite extensively that these scenarios occur all the time in real-world code

I hope I haven't made you feel like I'm questioning your expertise here. I completely recognize that you have a better understanding of the problem space.

I think this is the only proposal of yours that I disagree with, and I'm approaching this from the standpoint of how to explain to the committee why we're breaking precedent, and whether it breaks the use cases that I have in mind in the README examples.

I have not been invited to any of the meetings, so I've definitely missed all those conversations, and I only joined the Matrix chat somewhat recently as I was unaware there was a channel there for this.

This is a mistake on our part; I thought you were included in the invite list. Please message me on Matrix with your email!

As for the multi-path thing, that's something I'm only more recently trying to design solutions for (I have a bunch of docs to share on that soon) so I wasn't yet sure if that was a thing which could be solved reasonably in only the single-path way.

The docs here would be very useful. We're currently trying to reach Stage 2.7 (signal to implementers that they should begin implementing), and the last remaining piece is defining the expected HTML integration. I think based on this discussion and the callingContext() discussion, we still have an unaddressed use case that we should figure out before reaching Stage 3.

Jamesernator commented 2 months ago

If AsyncContext was to land in a state that it is unable to satisfy the majority of use cases then different systems will surely continue to exist

You keep stating this in various ways, but haven't provided compelling points that this change is good for any of the use cases other than tracing.

Just to be clear, I do think everyone here believes that the current design has problems for tracing. But introducing problems for things like prioritized scheduling (like your own suggestion that guards should wrap every await to avoid inner paths poisoning your current path) is just terrible, and is in fact worse than the status quo: unlike passing parameters, it's as if those parameters were mutable and changed their value at the call site.

You keep stating that developers expect context to flow out, but you don't really address why this shouldn't apply equally to synchronous code, and why await should have magical powers here.

Like in your example:

    // The job received here has distributed tracing headers passed from another service.
    // We want to create a span when client.send(...) receives the result of the command
    // and finds distributed tracing headers. This span should then be stored in the context
    // so subsequent execution can connect to that span.
    const command = new ReceiveMessageCommand(input)
    const response = await client.send(command)

    // Do some processing...

the fact that await is used here should not matter, the headers should be preserved even in the synchronous case.

Further you still need to address how Promise.all (and co) work, like in your example of:

For example, say you have a database client connection object and make a query call which takes a callback, but as part of the connection you have some session management step that can be defined as another function, or perhaps an event

what happens if we have:

await Promise.all([
    oneTaskThatStartsASession(),
    anotherTaskThatStartsASession(),
]);
// What is the session here?

Does this create two sessions and one arbitrarily wins? Are all sessions available as an array? Do the two tasks somehow coordinate to avoid creating redundant sessions, and if so, how?

That's how we create memory leaks.

Your suggested change here produces the same problem but the other way around; now if you set a value, it leaks out:

const asyncVar = new AsyncContext.Variable();

async function foo() {
    return asyncVar.run(someLargeThing, async () => {

    });
}

// The large thing has leaked into the outer scope
await foo();

Now this thread could go in circles forever arguing one approach over the other, however I think the real solution is just to have two kinds of "async variables".

I would like to explain what I think is really the solution here that works for all use cases, but first let's revisit SyncContext and function calls.

As everyone here probably knows, when a function call occurs its parameters are essentially pushed onto the call stack, and the function is given access to that part of the stack, which it sees as parameters. However, this behaviour isn't fundamental in the synchronous case; one could simulate parameters by simply keeping your own stack on objects:

class Param {
    #stack = [];

    run(value, fn) {
        try {
            this.#stack.push(value);
            return fn(); 
        } finally {
            this.#stack.pop();
        }
    }

    get() {
        return this.#stack.at(-1);
    }
}

const params = new Param();

function add() {
    const [x, y] = params.get();
    return x + y;
}

const v = params.run([1, 2], add);

This covers parameters; however, call stacks also deal with returns (and exceptions), which are similar to parameters but live on the parent part of the stack.

Like in this example:

function foo() {
    const v = add(3, 4);
}

the parameters 3, 4 are pushed onto the call stack at their relevant locations. However, the return value essentially lives at the location of v. The time for v to be popped is decided by foo; this is the opposite of parameters, where x and y are popped by add.

Now, what is being called SyncContext is essentially synchronous parameters; what is currently AsyncContext.Variable is the asynchronous counterpart of that. For transitive schedulers and dependency injection, this is what they would use.

However it's clear tracing (and some other use cases I'm sure) needs the asynchronous equivalent of returns, i.e. the value is received from the resolve context (i.e. the return context).

Some use cases probably even require mixes between asynchronous parameters and asynchronous returns in which case they would need both behaviours.

As such, as a specific proposal, I'd suggest we simply have a second kind of variable which behaves the way return does:

// With existing variable being renamed to ParameterVariable
const v = new AsyncContext.ReturnVariable();

async function openFile() {
    v.set("openFile");
    return ...;
}

async function foo() {
    v.set("foo");
}

async function bar() {
    v.set("bar");
}

async function main() {
    const fd = await openFile(...);
    v.get(); // openFile

    await Promise.all([foo(), bar()]);
    v.get(); // ["foo", "bar"]
}

In particular, the async context mapping would consist of two parts: the parameters context (same as today) and the return context(s). When resuming an await, the parameter context would be restored and the return context set.

The snapshots should probably even be split as well, i.e. you could capture the return and parameter snapshots separately:

class EventsLike {
    #listeners = [];

    addListener(listener) {
        this.#listeners.push({
            listener,
            parameterCtx: new AsyncContext.ParameterSnapshot(),
        });
    }

    notify(v) {
        for (const { listener, parameterCtx } of this.#listeners) {
            const returnCtx = new AsyncContext.ReturnSnapshot();
            queueMicrotask(() => {
                AsyncContext.run(listener, parameterCtx, returnCtx);
            });
        }
    }
}

const e = new EventsLike();

const p = new AsyncContext.ParameterVariable();
const r = new AsyncContext.ReturnVariable();

p.run("init", () => {
    e.addListener(() => {
        console.log(p.get()); // init
        console.log(r.get()); // notify
    });
});

r.run("notify", () => e.notify());

In terms of an API shape (and naming) I don't know what the best option here is, we could either have partitioned APIs:

namespace AsyncContext {
    export class ParameterVariable<T> { ... }

    export class ReturnVariable<T> { ... }

    export class ParameterSnapshot<T> { ... }

    export class ReturnSnapshot<T> { ... }
}

or we could have options:

namespace AsyncContext {
    export class Variable<T> {
        constructor(opts: { kind: "parameter" | "return", ... });
    }

    export class Snapshot<T> {
        // The return-only part of the snapshot
        get returnSnapshot(): Snapshot<T> { ... }
        // The parameters-only part of the snapshot
        get parametersSnapshot(): Snapshot<T> { ... }
    }
}

Qard commented 2 months ago

@jridgewell

As far as I understand, both my proposed mutations and the resolve-time continuations would keep the parent chain alive. Especially if you're building a follows-from and child-of relationship, wouldn't the resolve path keep those spans alive?

Yes, both keep the data alive, but pushing spans into other spans keeps more data alive, which significantly increases memory pressure. It's not a "leak" in the traditional sense, but users always call it a leak because requests suddenly take several times more memory space than usual. Observability, by design, captures data everywhere and so that adds up quickly unless we're very careful about flowing the minimum amount of data possible. This is especially a concern when dealing with highly asynchronous environments like Node.js--10k concurrent requests means 10k concurrent traces, each with potentially 1k+ spans.

That seems like it could apply with either decision. I assume we'd still need a parent ID so that we can build the relationship after the current execution is completed?

We need to hold only the ID representing the "span" data itself until we reach the continuation (callback, promise resolution, etc.). At that point we send an event with that ID marking the end of that span, and any new span created within that continuation or any nested execution considers that ID its parent and swaps in its own ID for its part of the async execution tree. So you actually don't need separate stores for current and parent IDs; the current always becomes the parent when a switch happens. We want to send the data out of the process immediately when a span ends, not hold it in an in-memory graph, because an in-memory graph is too expensive to store.
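
As a rough sketch of that flow (newSpanId and exporter are hypothetical stand-ins for the APM's ID generator and exporter):

const currentSpanId = new AsyncContext.Variable();

// Only a correlation ID lives in the context. A new span treats whatever ID is
// currently in the context as its parent, is exported immediately, and its own
// ID becomes the context value for the nested execution.
function startSpan(name) {
  const parent = currentSpanId.get();
  const id = newSpanId();
  exporter.emit({ event: "start", id, parent, name });
  return id;
}

function endSpan(id) {
  // Export immediately; nothing about the span is retained in memory.
  exporter.emit({ event: "end", id });
}

function withSpan(name, fn) {
  const id = startSpan(name);
  return currentSpanId.run(id, fn);
}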

My reason for proposing this in the language was to use ALS in a web environment, with considerably better performance. I feel like the proposal is successful at that, and so I'm hesitant to break with precedent.

Yes, I agree that it achieves mostly the same as what ALS does now, but with better performance. However, you only benefit from that better performance if you don't need to go to significant lengths to work around it not flowing in a path that is useful to your use case, which is why I believe we need more tools to be able to influence the flow. This is why I proposed instance-scoped bind previously, and why I was supportive of the calling context idea, because those can be built with pretty good performance at the runtime level, whereas building them externally is very expensive yet we quite often need those capabilities. Control over data flow and lifetimes is a big part of how performance can be achieved in these scenarios, thus my skepticism that there's actually much benefit putting this in the language if it doesn't actually enable us to gain from these supposed advantages.

I've proposed how I would solve the OP with the current semantics. I'm interested in figuring out why that approach doesn't work. I need more examples to understand why the memory leak exists in one situation and not the other.

As I stated above, your solution involves holding a lot of data to build the graph in-memory. This is too expensive. It's not uncommon for an app to have 10k-50k concurrent requests, which means 10k-50k traces, which would typically amount to about 10M+ spans. This typically amounts to 1GB+ of memory being occupied by spans, which makes users unhappy about the overhead.

We can trim this down somewhat and store reduced representations, which is essentially just a step toward only tracking IDs, but we still have not dealt with the CPU cost of working around the fact that the context flow doesn't deliver what our users expect, which forces us to do additional work to hack around those expectations. This is work APM vendors are doing right now on top of ALS, because it doesn't deliver the exact model users expect in some cases, so we are currently trading overhead for correctness from our users' perspective. If we had a model with more flexibility we would not have this problem.

My disagreement here is specifically the "by not supporting would be a significant source of confusion to many users" statement. The way that I reason about AC (and everyone in the meeting yesterday), and the way I've explained it to the committee is with the current semantics.

Yes, I am aware that is how it has been communicated so far. But, to be frank, there is zero representation in the committee from people building production observability tools, so you all simply lack awareness of the more advanced needs. That's fine, that's why I'm here trying to share my experience on that topic and to get people thinking about solutions for these more demanding needs. Also, as I expressed before, I don't disagree with having a default path that is the most obvious to the average user. What I disagree with is dismissing the other perfectly valid paths entirely by not considering tools to enable other such paths. There are very valid use cases for different flows and we should be considering how we can enable them, even if it involves a little extra effort to reach those other paths.

From the meeting yesterday, I think we're interested in exploring this. You, @mhofman, and Shay (I don't know their GH handle) have raised the need to receive a promise's resolving context in some scenarios. This seems analogous to callingContext(), and if we can find an API that solves this use case then we should pursue that.

Yes, agreed that we want something like that, though I think the callingContext() proposal approaches it from the opposite side of what we would generally want: we don't want to change the global flow, and in fact we don't necessarily want to discard the "normal" flow at all, we just want to be able to read from both paths.

I'm approaching from the standpoint of how to explain the changes to the committee on why we're breaking precedent, and does it break the use cases that I have in mind in the README examples.

I don't view it as a break so much as an addition. Sorry if I didn't communicate the original message well, but I'm definitely not trying to say that flowing out of the await is the only valid interpretation, just that it is a valid interpretation, and there are definite scenarios where users would expect it, so I think there need to be ways to express both paths.

This is a mistake on our part, I thought you were included in the invite list. Please message me on Matrix with your email!

Not that I've ever seen, no. I'll message you about it on Matrix soon. πŸ™‚

We're currently trying to reach Stage 2.7 (signal to implementers that they should begin implementing), and the last remaining piece is defining the expected HTML integration. I think based on this discussion and the callingContext() discussion, we still have an unaddressed use case that we should figure out before reaching Stage 3.

Yes, agreed. And I don't think it's necessarily a significant change, but I want to share my docs soon (hopefully end of this week) which will explain some of our own ideation and our reasoning for why we think things should be a bit different.


@Jamesernator

Something you keep stating is that developers expect context to flow out, but you don't really address why this shouldn't apply equally to synchronous code and why await should have magical powers here.

Because nothing can happen between one piece of sync code and another directly following it, we can just use regular variables at that point. With async code there can be interleaved execution between logically continuing steps, so we need a tool to retain data across those continuations. ALS and other JS-based context management systems are only built to follow one direction of the application's data flow, but other languages with context management implementations often follow both directions.
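
As a contrived illustration of the difference (a plain global and made-up helper names, just to show the interleaving):

let current = null; // plain global, fine for purely synchronous code
const pause = () => new Promise((resolve) => setTimeout(resolve, 10));

function handleSync(request) {
  current = request;
  console.log("sync sees", current); // nothing can interleave here
  current = null;
}

async function handleAsync(request) {
  current = request;
  await pause(); // another handleAsync() call can run in this gap
  console.log(request, "sees", current); // may now report the other request's value
}

handleSync("req-1");
handleAsync("req-2");
handleAsync("req-3"); // clobbers `current` before req-2's continuation resumes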

the fact that await is used here should not matter, the headers should be preserved even in the synchronous case.

It's not the preserving of the headers that matters, it's the extracting of the headers to produce a span within which the operation on the task occurs. Doing that requires being able to flow the span out of the await, as that's the only valid place to transform those headers into a span; the rest is user code which is unknowable.

Further you still need to address how Promise.all (and co) work

I don't yet have fully-formed intuition on exactly what that should look like, but OpenTelemetry does specify that we should link all of those branches as follows-from when a span is created in the continuation after the Promise.all(...) resolution. There could be some sort of special-case aggregate context for Promise.all(...). It follows the reverse path, which our current context does not flow through, so it could be that we just specify a different, more manual method for flowing the other way.

I don't have all the answers here, but I do have a collection of ideas which have been discussed within the tracing space. For example, one that has come up is a fork and join model where scheduling of a promise forks the context and await joins it back into another context.

Your suggested change here produces the same problem but the other way around, now if you set a value it leaks out

Yes, that's kind of intentional, actually. Everything after the await is a continuation of that promise resolving, so it is logically descending in some sense. If you wrote the same code with callbacks it would be obvious that the context should flow into it, as it would be a callback passed into foo and executed within it.
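
To make that equivalence concrete, here is the same control flow written both ways; foo is just a stand-in:

const foo = () => Promise.resolve();

// async/await form: everything after the await is a continuation of foo resolving.
async function withAwait() {
  await foo();
  // ...continuation...
}

// callback form of the same flow: the continuation is literally a function
// invoked from foo's resolution path, so it's natural to expect it to see
// whatever context that resolution established.
function withCallback() {
  return foo().then(() => {
    // ...continuation...
  });
}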

Now this thread could go in circles forever arguing one approach over the other, however I think the real solution is just to have two kinds of "async variables".

I partly agree, though I'm not entirely convinced it should be two different variables; I have been leaning toward two different flows expressed within the same variable. I'm not firm on that though. We just need something to express the other flow.

However it's clear tracing (and some other use cases I'm sure) needs the asynchronous equivalent of returns, i.e. the value is received from the resolve context (i.e. the return context).

Yes, exactly. To model the "follows-from" graph which OpenTelemetry describes, we need to be able to retain the parent in a return-flow graph when promises resolve into the continuation that does more work and could produce a new span.

Some use cases probably even require mixes between asynchronous parameters and asynchronous returns in which case they would need both behaviours.

Yep, so this is where I'm unsure that a two-variable model would work. There are cases where the logical flow is going inward into the calls and terminating, but there are cases such as my job processor example in the original post where it's also logical that the context flows out to the parent where it was awaited. Perhaps some sort of indicator on the part of that inner run that it should flow out? Like what if we had something like:

async function foo() {
  return store.runAndReturn('foo', () => {
    return Promise.resolve()
  })
}

await foo()
// Should be 'foo' after because we've explicitly stated the context should flow out of the return

It's also worth noting that, at least in the case of tracing, I think we'd almost always want promises to flow through this return path.

andreubotella commented 2 months ago

@Qard You mentioned (in private communication, I think) that you have been looking at context propagation across multiple languages. So is there any language or framework that has something like async/await and has a context propagation management that matches the behavior you're describing, so we can look more in depth into it and use it as prior art? Or is your proposal something completely new?

Qard commented 2 months ago

Yep, working on a cross-language context management design. Yet to be seen how well it applies universally, but an RFC for it exists and is getting a lot of engagement internally. This is part of the docs I plan to share soon.

In most languages APMs have invented their own context management system. This works for languages like Java and .NET where they have access to rewriting bytecode, but without that power we can't really do the same externally, thus the limitations of ALS as it is presently. This does also mean the shape of context in those languages is quite different at the moment.

I'll ask around within my team and see what can be shared from our other languages. Most of the team is still asleep at the moment though, so I'll get back to you on that...

I did get pointed to this related issue for .NET with AsyncLocal though: https://github.com/dotnet/runtime/issues/99153

Jamesernator commented 2 months ago

So is there any language or framework that has something like async/await and has a context propagation management that matches the behavior you're describing, so we can look more in depth into it and use it as prior art? Or is your proposal something completely new?

I've been looking into Python's implementation due to this thread and due to the fact that JS's generator support is basically the same as Python's, and it's largely similar to this proposal but offers some additional tools for the inner path (but probably not in a way that would be useful for tracing).

Now Python has a library called contextvars; it contains two main things that are essentially Variable and Snapshot in this proposal.

We have ContextVar which creates a variable:

from contextvars import * 
c_var = ContextVar("cVar")

In Python, contexts are fully map-like, so you can get/set at any point once you've created the variable.

c_var.set(3)
print(c_var.get())

Like AsyncContext.Snapshot there is a way to snapshot the current context, and run things within that new context in the same way:

c_var = ContextVar("cVar")

def foo():
    c_var.set("in-foo")

c_var.set("in-main")
ctx = copy_context()
ctx.run(foo)
print(c_var.get()) # Prints in-main

What Python doesn't have is the equivalent of asyncVar.run(v, fn); instead you create a new context, run a function in that context that sets those variables, and then run fn in it. For example, to set a priority you'd do:

priority_var = ContextVar("var")

def the_real_thing_to_run():
    print(priority_var.get()) # "high"

def run_with_priority():
    the_real_thing_to_run()

ctx = copy_context()
ctx.run(lambda: priority_var.set("high"))
ctx.run(run_with_priority)

(Functionally this is all asyncVar.run(v, fn) even does internally anyway.)

Now you might've noticed Python's thing is just called Context; that's because there's nothing "async" about it. It's basically just the SyncContext that has been mentioned in the slides for the proposal.

So how does this integrate with Python's async/await? Well, first you need to know that coroutines in Python aren't eager: calling the async function doesn't start the coroutine, and coroutines don't have .then or similar methods; the coroutine is really nothing more than a generator. Instead, in Python you create an event loop and dispatch coroutines onto it.

As a consequence this does nothing whatsoever:

c_var = ContextVar("var")

async def foo():
    c_var.set("foo")

ctx = copy_context()
ctx.run(foo)
print(c_var.get()) # LookupError

However if you are running in an event loop, you can use context vars with .get()/.set().

c_var = ContextVar("var")

async def foo():
    c_var.set("foo")

async def main():
    c_var.set("main")
    await foo()
    print(c_var.get()) # prints foo

asyncio.run(main())

Before you think this is special, nothing particularly interesting is happening here, it's just like if we had a mutable asyncVar:

const cVar = new AsyncContext.Variable();

async function foo() {
    cVar.get().value = "foo";
}

async function main() {
    cVar.get().value = "main";
    await foo();
    console.log(cVar.get().value); // print foo
}

cVar.run({ value: null }, main);

Now where things actually do get special is how Python deals with context containment in coroutines. Because coroutines are basically just generators by a different name (well, the separate name makes it easy to distinguish await vs yield in Python's async generators), all await really does is the same thing as yield* in JS.

Instead, Python has a type similar to Promise called Future; this has .then by another name (and without any flatmapping). Future has a subclass called Task which actually wraps a coroutine into a future. In order to give a task a separate context, instead of ctx.run(sub_task) you call asyncio.create_task(coroutine).

When you create a task this way, it automatically does the equivalent of ctx.run, creating a new context for the task to run in, which changes the above example:

c_var = ContextVar("var")

async def foo():
    c_var.set("foo")

async def main():
    c_var.set("main")
    await asyncio.create_task(foo())
    print(c_var.get()) # Now prints main

asyncio.run(main())

Now this looks like the semantics of the current proposal (restoring the context after await). However, as of the most recent version of Python, tasks also expose their context, which means it's possible to view the context from the task itself (you can also pass in a custom context rather than defaulting to a new one, which is analogous to ctx.run from the sync case earlier):

c_var = ContextVar("var")

async def foo():
    c_var.set("foo")

async def main():
    c_var.set("main")
    task = asyncio.create_task(foo())
    await task
    print(c_var.get()) # main
    print(task.get_context().get(c_var)) # foo 

asyncio.run(main())

This means that, yes, contexts do live as long as their "promises"; however, you only retain one for as long as you actually have a first-class reference to the task. So the inner path is available, but it's not really available to the sorts of things that would care, because you need to explicitly pass the task around to get at its context.

Some general thoughts I have about Python's design here and how it relates to this proposal:

Qard commented 2 months ago

Yeah, basically every other language which has implemented context management has used implicit scopes and a simple setter/getter pattern rather than our run(value, scope) design. I personally preferred that approach when context management designs were first being proposed to Node.js, but some core folks rejected the idea as they felt Node.js had too vague a definition of where such barriers would exist. .NET does a simple AsyncLocal type with just a set(...) and get(...) on it. In fact, a direct port of that was one of the original proposals for context management design in Node.js core before it ultimately accepted AsyncLocalStorage.

Before you think this is special, nothing particularly interesting is happening here, it's just like if we had a mutable asyncVar:

It actually is special, and it's an example of what I'm asking for. While the shape of the API and the linearity of the code example make it look like a simple mutable object, it's not just a plain assignment. It manages the flow into continuations, and you can see it doing the same thing even when executing concurrently, meaning it's not just patching the value in place.

Calling asyncio.create_task(...) is how Python makes things actually run concurrently. The coroutine would otherwise not run until awaited and therefore would be effectively linear. The fact that asyncio.create_task(...) contains the context flow the way it does is how they manage context across those concurrent flows.

Consider this code:

from contextvars import ContextVar
import asyncio

c_var = ContextVar("var")

async def foo(name):
  c_var.set(name)

async def bar(name, n):
  c_var.set("bar")
  await foo(name)
  print("adjusted", name)
  await asyncio.sleep(n)
  print(name, c_var.get()) # prints name

async def main():
  task1 = asyncio.create_task(bar("first", 0))
  task2 = asyncio.create_task(bar("second", 0.1))
  await task1
  await task2

asyncio.run(main())

It will run the two tasks concurrently, and the expected values reach the final print of each task because of the task isolation. It's certainly a bit different from our model, but it's definitely not the same as a mutable object: if it were, the two branches would be mutating the same value and would not get the correct results due to the interleaving.

I'm not sure exactly what insight there is to be gained from this, as Python's asynchrony model is itself different, so layering context management on top of it will of course look different. πŸ€”

Qard commented 2 months ago

I think the source of my confusion with the resolve path thing with promises was a combination of https://github.com/tc39/proposal-async-context/issues/16 and https://github.com/nodejs/node/issues/46262.

Flarna commented 2 months ago

Maybe; in both cases the agreement was to stick with capturing at call time, except for unhandledrejection.

Regarding set/get APIs: a .NET colleague complained quite a bit about this because it's often done wrong when used nested. While nested .run() calls enter/exit in the right order and always restore the context correctly, it's quite easy for the set calls that restore the context at the end to be missed (e.g. exception, early exit, ...) or called in the wrong order.

Some languages have constructs like try-with-resources to again get a defined, exception-safe enter/exit behavior but without a callback. A bit like the explicit resource management proposal.
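
A contrived sketch of that failure mode, using a made-up mutable ctx just for illustration:

// Made-up mutable context, only to illustrate the restore-ordering hazard.
const ctx = {
  value: undefined,
  set(v) { this.value = v; },
  get() { return this.value; },
};

function withSetGet(input) {
  const previous = ctx.get();
  ctx.set("inner");
  if (!input) return;   // oops: early exit skips the restore below
  JSON.parse(input);    // oops: a throw here also skips the restore
  ctx.set(previous);
}

// run()-style enter/exit always unwinds correctly, even on throw or early return.
function run(value, fn) {
  const previous = ctx.get();
  ctx.set(value);
  try {
    return fn();
  } finally {
    ctx.set(previous);
  }
}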

I don't know the Python implementation, but at the time I worked on the AsyncLocal proposal for Node I took a closer look at the .NET implementation. As far as I remember they clone the context object whenever set() is called. I think this is very similar to the AsyncContextFrame you used somewhere else.

What is really special in the .NET implementation is that it provides a hook to get notified whenever the content of the variable changes. This enables various use cases, like invalidating the data in the context if one notices that it's being propagated forever unintentionally (e.g. via a setInterval(), ...).

Qard commented 2 months ago

Regarding set/get APIs: a .NET colleague complained quite a bit about this because it's often done wrong when used nested. While nested .run() calls enter/exit in the right order and always restore the context correctly, it's quite easy for the set calls that restore the context at the end to be missed (e.g. exception, early exit, ...) or called in the wrong order.

I don't advocate for set/get over run, that's just what most other systems seemed to do before AsyncLocalStorage, and many still use that pattern. It's generally okay, but it depends on the level of complexity in how the runtime manages tasks. Node.js happens to be fairly complex in that regard so it's easy to run into strange behaviours without clearly defining an end boundary.

As far as I remember they clone the context object whenever set() is called. I think this is very similar to the AsyncContextFrame you used somewhere else.

Yep, most context implementations I've seen do a copy-on-write thing, either at the instance level or with a shared underlying representation like AsyncContextFrame.
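
For reference, the copy-on-write shape I'm describing is roughly the following; this is a simplified sketch, not the actual AsyncContextFrame or .NET code:

// Each "frame" is treated as immutable; changing a variable copies the frame,
// but capturing or propagating a context is just a pointer copy.
let currentFrame = new Map();

function set(variable, value) {
  const next = new Map(currentFrame); // copy only when something actually changes
  next.set(variable, value);
  currentFrame = next;
}

function get(variable) {
  return currentFrame.get(variable);
}

function snapshot() {
  return currentFrame; // no cloning needed to capture
}

function runInFrame(frame, fn) {
  const previous = currentFrame;
  currentFrame = frame;
  try {
    return fn();
  } finally {
    currentFrame = previous;
  }
}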

What is really special in the .NET implementation is that it provides a hook to get notified whenever the content of the variable changes. This enables various use cases, like invalidating the data in the context if one notices that it's being propagated forever unintentionally (e.g. via a setInterval(), ...).

I've been working on a language-agnostic proposal for context concepts which includes basically synchronous "windows" around the parts of the execution that a context applies to, and which integrates with a language-agnostic diagnostics channel concept that would emit before and after events around each window. I'll be sharing more very soon; just revising a few things in the doc and figuring out how to share it, as our corporate Google Docs policy forbids public docs. πŸ˜…

legendecas commented 2 months ago

Yes, both keep the data alive, but pushing spans into other spans keeps more data alive, which significantly increases memory pressure. It's not a "leak" in the traditional sense, but users always call it a leak because requests suddenly take several times more memory space than usual. Observability, by design, captures data everywhere and so that adds up quickly unless we're very careful about flowing the minimum amount of data possible. This is especially a concern when dealing with highly asynchronous environments like Node.js--10k concurrent requests means 10k concurrent traces, each with potentially 1k+ spans.

I am confused by the statement that it is the design of the context propagation that causes the memory pressure. I would think that the OpenTelemetry SpanContext, which would be saved in a context variable, prevents a span from being kept alive indefinitely, whilst preserving the ability to formulate correlations between spans. And this can be built on top of the proposed AsyncContext.Variable.

Qard commented 1 month ago

OpenTelemetry is actually a perfect example of the problem: the JS version has a context manager which stores all pieces of data in one giant map that needs to be copied and modified every time any context data changes, rather than using a different store instance per context key.

This means everything needs to flow together and be accessed together. It means storing large objects and doing the extra work of navigating those objects every time context is accessed, rather than splitting the data into many variables.

Context systems need to be multi-tenant because there will undoubtedly be multiple users within most complex applications. But if individual users then only use one store instance for all their data, they're doing one level of lookup to get their store among all the stores and then another lookup to pull their key out of the map they've stuffed into that store. If there were just one level, with many stores each directly representing an individual piece of data, it would be faster and simpler, would spend substantially less time doing map copies on every change, and would avoid all the GC reference complexity that adds. If you're just pointer-copying most things and only changing the stores that actually need changing, it's a whole lot better from both a memory and a computation cost perspective.
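
Roughly, the cost difference between the two shapes looks like this (illustrative only, not the actual OpenTelemetry internals):

// One-giant-map style: every tenant's data lives in a single map, so any
// change copies every entry, and every read is two lookups (store, then key).
let bigContext = new Map();
function setInBigContext(key, value) {
  bigContext = new Map(bigContext).set(key, value); // copies all entries on every change
}
function getFromBigContext(key) {
  return bigContext.get(key);
}

// Per-variable style: each piece of data is its own store. A change swaps the
// value for just that store, a read is a single lookup, and propagating the
// whole context is just carrying pointers to the unchanged stores.
const spanVar = new AsyncContext.Variable();
const baggageVar = new AsyncContext.Variable();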

shicks commented 1 month ago

@andreubotella (from way upthread)

Do APMs need to be able to access the contexts for all of the promises resolved with Promise.all? If so, would this be special-cased to Promise.all, or are there userland APIs that logically join multiple contexts that would also need this capability? What about multiple originating sources for a web API event, like the <video> case I mention in #82?

Edit: Also, how would multiple calling contexts flow through the data flow graph?

Computed signals can logically have multiple calling contexts:

const var1 = new AsyncContext.Variable(); // assumed: var1 is an AsyncContext.Variable
const sig1 = new Signal.State(1);
const sig2 = new Signal.State(2);
const sum = new Signal.Computed(() => sig1.get() + sig2.get() + var1.get());
var1.run(3, () => sig1.set(4));
var1.run(5, () => sig2.set(6));

Computed signals are evaluated lazily, so it logically has two causes, and those contexts would presumably need to be merged somehow (exactly how probably depends on the specific variable).


Reading through this thread, it seems like we've been talking past each other for most of it. Interestingly, I'm understanding @Qard to be saying that preserving the context mapping across await doesn't get you anything you can't already basically do with a lexical variable, but that you need the non-preserving behavior to track the "follows-from" case in a reasonable way. On the other hand, my understanding (and I believe many others) is the exact opposite - that not preserving the context mapping across await doesn't get you anything you can't already basically do with a mutable global variable (and/or mutable state within an AsyncContext.Variable) - if the VM isn't handling the snapshot swapping across awaits, then I don't really understand what it's doing that isn't pretty easy to do in userland?

In fact, tracing is one of my goals for AsyncContext, and mutating state on the root trace is exactly how we're planning to handle this case. But we're only looking at web, so we don't have the kind of QPS that servers would need to worry about.
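
Concretely, something like the following sketch (names made up, with readFile/openFile echoing the example from the OP) is what I mean by mutating state reachable from the root trace:

const traceVar = new AsyncContext.Variable();
const startSpan = (name, followsFrom) => ({ name, followsFrom }); // stand-in

// Instrumented library code mutates the shared per-trace object it finds in the Variable.
async function openFile() {
  const trace = traceVar.get();
  trace.lastSpan = startSpan("openFile", trace.lastSpan);
}

async function readFile() {
  const trace = { lastSpan: null };
  await traceVar.run(trace, async () => {
    await openFile();
    // The context mapping snapped back after the await, but the mutable object
    // it holds was updated inside openFile, so the follows-from parent is
    // visible here without needing the resolve-context flow.
    trace.lastSpan = startSpan("readFd", trace.lastSpan);
  });
}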

Jamesernator commented 1 month ago

Computed signals can logically have multiple calling contexts:

Even multiple calling contexts aside, this example is a good reason why "return-like" variables should be able to flow up the stack even in synchronous cases.


I went ahead and wrote up some possible semantics for the two-kinds-of-flow suggestion I made above which should help for arguing about precise semantics for how these flows actually work.

Here, "parameter variables" work the same way the proposal already does today and have the same methods. "Return variables" work more like is being proposed here in the OP, essentially the context at resolve is what is restored after an await.

(Also obviously I can't change the builtin async/await, so "async-await" is just implemented here with the classic generator approach to demonstrate the proposed semantics/rough implementation).

Example usages in async await, event integration, and Promise.all are shown at the bottom.

namespace AsyncContext {
    type ParameterMapping = Map<Variable<any>, any>;

    export type GetReturn<T> =
        | { kind: "multiple"; values: Array<GetReturn<T>> }
        | { kind: "one"; value: T | undefined };

    class ReturnMappings {
        static ofMappings(childMappings: ReadonlyArray<ReturnMappings>): ReturnMappings {
            return new ReturnMappings(new Map(), childMappings);
        }

        readonly #mappings: Map<Variable<any>, any>;
        readonly #childMappings: ReadonlyArray<ReturnMappings>;

        constructor(
            mappings = new Map<Variable<any>, any>(),
            childMappings: ReadonlyArray<ReturnMappings> = [],
        ) {
            this.#mappings = mappings;
            this.#childMappings = childMappings;
        }

        #get(variable: Variable<any>): GetReturn<any> {
            if (this.#mappings.has(variable)) {
                return { kind: "one", value: this.#mappings.get(variable) };
            }
            return {
                kind: "multiple",
                values: this.#childMappings.map((mapping) => {
                    return mapping.#get(variable);
                }),
            };
        }

        get(variable: Variable<any>): GetReturn<any> {
            return this.#get(variable);
        }

        #without(variable: Variable<any>): ReturnMappings {
            const newMapping = new Map([...this.#mappings]);
            newMapping.delete(variable);
            return new ReturnMappings(
                newMapping,
                this.#childMappings.map((mapping) => {
                    return mapping.#without(variable);
                }),
            );
        }

        without(variable: Variable<any>): ReturnMappings {
            return this.#without(variable);
        }

        withSet(variable: Variable<any>, value: any): ReturnMappings {
            return new ReturnMappings(
                new Map([...this.#mappings, [variable, value]]),
                // We clean the inner sets to avoid holding onto dead values, in practice this
                // data structure is just not very good requiring this much enumeration to set
                // values
                this.#childMappings.map((mapping) => {
                    return mapping.#without(variable);
                }),
            );
        }
    }

    export let activeParameterMapping: ParameterMapping = new Map();
    export let activeReturnMappings: ReturnMappings = new ReturnMappings();

    export type VariableOptions<T> = {
        name?: string | undefined;
        defaultValue?: T;
    };

    export class Variable<T> {
        readonly #name: string;
        readonly #defaultValue: T | undefined;

        constructor({ name = "", defaultValue }: VariableOptions<T> = {}) {
            this.#name = name;
            this.#defaultValue = defaultValue;
        }

        get name(): string {
            return this.#name;
        }

        get(): T | undefined {
            if (activeParameterMapping.has(this)) {
                return activeParameterMapping.get(this);
            }
            return this.#defaultValue;
        }

        getReturn(): GetReturn<T> {
            return activeReturnMappings.get(this);
        }

        run<Args extends ReadonlyArray<any>, Return>(
            value: T,
            cb: (...args: Args) => Return,
            ...args: Args
        ): Return {
            const previousParameterMapping = activeParameterMapping;

            try {
                activeParameterMapping = new Map([...activeParameterMapping, [this, value]]);
                return cb(...args);
            } finally {
                activeParameterMapping = previousParameterMapping;
            }
        }

        setReturn(value: T): void {
            activeReturnMappings = activeReturnMappings.withSet(this, value);
        }

        deleteReturn(): void {
            activeReturnMappings = activeReturnMappings.without(this);
        }

        setReturnIn<Args extends ReadonlyArray<any>, R>(
            value: T,
            fn: (...args: Args) => R,
            ...args: Args
        ): R {
            const previousReturnMappings = activeReturnMappings;
            try {
                activeReturnMappings = activeReturnMappings.withSet(this, value);
                return fn(...args);
            } finally {
                activeReturnMappings = previousReturnMappings;
            }
        }
    }

    export type SnapshotOptions = {
        captureInVariables?: boolean | undefined;
        captureOutVariables?: boolean | undefined;
    };

    export function setReturnAll(snapshots: ReadonlyArray<Snapshot>): void {
        activeReturnMappings = ReturnMappings.ofMappings(
            snapshots
                .map((snapshot) => {
                    return getReturnMappings(snapshot);
                })
                .filter((mapping): mapping is ReturnMappings => mapping !== null),
        );
    }

    let getReturnMappings: (snapshot: Snapshot) => ReturnMappings | null;

    export class Snapshot {
        static wrap<This, Args extends ReadonlyArray<any>, Return>(
            fn: (this: This, ...args: Args) => Return,
            options: SnapshotOptions = {},
        ): (this: This, ...args: Args) => Return {
            const snapshot = new Snapshot(options);
            return function wrapper(this: This, ...args: Args): Return {
                return snapshot.run(() => {
                    return fn.call(this, ...args);
                });
            };
        }

        static {
            getReturnMappings = (snapshot) => snapshot.#returnMappings;
        }

        readonly #parameterMapping: ParameterMapping | null;
        readonly #returnMappings: ReturnMappings | null;

        constructor({
            captureInVariables = false,
            captureOutVariables = false,
        }: SnapshotOptions = {}) {
            if (!captureInVariables && !captureOutVariables) {
                throw new RangeError(`at least one type of async variable should be captured`);
            }
            this.#parameterMapping = captureInVariables ? activeParameterMapping : null;
            this.#returnMappings = captureOutVariables ? activeReturnMappings : null;
        }

        run<Args extends ReadonlyArray<any>, Return>(
            cb: (...args: Args) => Return,
            ...args: Args
        ): Return {
            const previousParameterMapping = activeParameterMapping;
            const previousReturnMappings = activeReturnMappings;

            try {
                activeParameterMapping = this.#parameterMapping ?? activeParameterMapping;
                activeReturnMappings = this.#returnMappings ?? activeReturnMappings;
                return cb(...args);
            } finally {
                activeParameterMapping = previousParameterMapping;
                activeReturnMappings = previousReturnMappings;
            }
        }
    }

    type PromisePendingState<T> = {
        kind: "pending";
        fulfilledCallbacks: Array<(value: T) => void>;
        rejectedCallbacks: Array<(error: any) => void>;
    };

    type PromiseFulfilledState<T> = {
        kind: "fulfilled";
        resolveContext: Snapshot;
        value: T;
    };

    type PromiseRejectedState = {
        kind: "rejected";
        resolveContext: Snapshot;
        error: any;
    };

    type PromiseState<T> = PromiseFulfilledState<T> | PromisePendingState<T> | PromiseRejectedState;

    export class Promise<T> {
        static #isPromise(value: any): value is Promise<any> {
            return value != null && #status in value;
        }

        static #resolve<T>(value: Promise<T> | T): Promise<T> {
            if (Promise.#isPromise(value)) {
                return value;
            }
            return new Promise<T>((resolve, reject) => {
                resolve(value as T);
            });
        }

        static #try<Args extends ReadonlyArray<any>, Return>(
            fn: (...args: Args) => Promise<Return> | Return,
            ...args: Args
        ): Promise<Return> {
            return new Promise((resolve, reject) => {
                const p = Promise.#resolve(fn(...args));
                p.#subscribe(resolve, reject);
            });
        }

        static all<T extends ReadonlyArray<unknown> | []>(
            values: T,
        ): Promise<{ -readonly [P in keyof T]: Awaited<T[P]> }> {
            return new Promise((resolve, reject) => {
                let pendingResults = 0;
                const results: Array<any> = [];
                const resolveContexts: Array<Snapshot | null> = [];
                for (const [index, value] of values.entries()) {
                    pendingResults += 1;
                    results.push(null);
                    resolveContexts.push(null as any);
                    Promise.resolve(value).then((value) => {
                        resolveContexts[index] = new Snapshot({ captureOutVariables: true });
                        pendingResults -= 1;
                        results[index] = value;
                        if (pendingResults === 0) {
                            AsyncContext.setReturnAll(resolveContexts as any);
                            resolve(results as any);
                        }
                    }, reject);
                }
            });
        }

        static resolve<T>(value: Promise<T> | T): Promise<T> {
            return Promise.#resolve(value);
        }

        static reject<T = never>(error: any): Promise<T> {
            return new Promise((resolve, reject) => {
                reject(error);
            });
        }

        #status: PromiseState<T> = {
            kind: "pending",
            fulfilledCallbacks: [],
            rejectedCallbacks: [],
        };

        constructor(init: (resolve: (value: T) => void, reject: (error: any) => void) => void) {
            try {
                init(
                    (value) => {
                        if (this.#status.kind !== "pending") {
                            return;
                        }
                        const { fulfilledCallbacks } = this.#status;
                        // NEW: Store the active return context along with the resolved value to
                        // be called with on future calls of .then
                        const resolveContext = new Snapshot({ captureOutVariables: true });
                        this.#status = { kind: "fulfilled", value, resolveContext };
                        for (const onFulfilled of fulfilledCallbacks) {
                            globalThis.queueMicrotask(() => {
                                resolveContext.run(onFulfilled, value);
                            });
                        }
                    },
                    (error) => {
                        if (this.#status.kind !== "pending") {
                            return;
                        }
                        const { rejectedCallbacks } = this.#status;
                        // NEW: Same as in resolve
                        const resolveContext = new Snapshot({ captureOutVariables: true });
                        this.#status = { kind: "rejected", error, resolveContext };
                        for (const onRejected of rejectedCallbacks) {
                            globalThis.queueMicrotask(() => {
                                resolveContext.run(onRejected, error);
                            });
                        }
                    },
                );
            } catch (error: unknown) {
                if (this.#status.kind === "pending") {
                    // Capture the context for synchronous rejection
                    this.#status = {
                        kind: "rejected",
                        error,
                        resolveContext: new Snapshot({ captureOutVariables: true }),
                    };
                }
            }
        }

        #subscribe(
            onFulfilled: ((value: T) => void) | null,
            onRejected: ((error: any) => void) | null,
        ): void {
            if (this.#status.kind === "fulfilled") {
                if (onFulfilled) {
                    const { value, resolveContext } = this.#status;
                    globalThis.queueMicrotask(() => {
                        resolveContext.run(onFulfilled, value);
                    });
                }
            } else if (this.#status.kind === "rejected") {
                if (onRejected) {
                    const { error, resolveContext } = this.#status;
                    globalThis.queueMicrotask(() => {
                        resolveContext.run(onRejected, error);
                    });
                }
            } else {
                if (onFulfilled) {
                    this.#status.fulfilledCallbacks.push(onFulfilled);
                }
                if (onRejected) {
                    this.#status.rejectedCallbacks.push(onRejected);
                }
            }
        }

        then<TResult1 = T, TResult2 = never>(
            onFulfilled?: ((value: T) => Promise<TResult1> | TResult1) | null | undefined,
            onRejected?: ((reason: any) => Promise<TResult2> | TResult2) | null | undefined,
        ): Promise<TResult1 | TResult2> {
            onFulfilled =
                onFulfilled ?
                    Snapshot.wrap(onFulfilled, { captureInVariables: true })
                :   onFulfilled;
            onRejected =
                onRejected ? Snapshot.wrap(onRejected, { captureInVariables: true }) : onRejected;

            return new Promise((resolve, reject) => {
                this.#subscribe(
                    (value) => {
                        if (onFulfilled) {
                            const p = Promise.#try(onFulfilled, value);
                            p.#subscribe(resolve, reject);
                        } else {
                            resolve(value as unknown as TResult1);
                        }
                    },
                    (error) => {
                        if (onRejected) {
                            const p = Promise.#try(onRejected, error);
                            p.#subscribe(resolve, reject);
                        } else {
                            reject(error);
                        }
                    },
                );
            });
        }
    }

    // NOTE: There is actually nothing in this implementation that cares about AsyncContext
    // all behaviour falls out as a consequence of how Promise.then is defined above
    export function async<This, Args extends ReadonlyArray<any>, Return>(
        genFunc: (this: This, ...args: Args) => Generator<any, Return, any>,
    ): (this: This, ...args: Args) => Promise<Return> {
        return function asyncWrapper(...args: Args): Promise<Return> {
            return new Promise<Return>((resolve, reject) => {
                const gen = genFunc.call(this, ...args);

                function continueAsync(value: any, isError: boolean): void {
                    try {
                        const iterResult = isError ? gen.throw(value) : gen.next(value);

                        if (iterResult.done) {
                            resolve(iterResult.value);
                        } else {
                            Promise.resolve(iterResult.value).then(
                                (value) => {
                                    continueAsync(value, false);
                                },
                                (error) => {
                                    continueAsync(error, true);
                                },
                            );
                        }
                    } catch (error: unknown) {
                        reject(error);
                    }
                }

                continueAsync(undefined, false);
            });
        };
    }

    export function queueMicrotask(cb: () => void): void {
        globalThis.queueMicrotask(
            Snapshot.wrap(cb, { captureInVariables: true, captureOutVariables: true }),
        );
    }

    export function setTimeout(cb: () => void, time: number): any {
        return globalThis.setTimeout(
            Snapshot.wrap(cb, { captureInVariables: true, captureOutVariables: true }),
            time,
        );
    }

    export function setInterval(cb: () => void, time: number): any {
        return globalThis.setInterval(
            Snapshot.wrap(cb, { captureInVariables: true, captureOutVariables: true }),
            time,
        );
    }
}

export default AsyncContext;

// Example with an event-like system
{
    class EventsLike<T> {
        readonly #listeners: Array<(v: T) => void> = [];

        addListener(listener: (v: T) => void): void {
            this.#listeners.push(
                AsyncContext.Snapshot.wrap(listener, { captureInVariables: true }),
            );
        }

        notify(v: T): void {
            for (const listener of this.#listeners) {
                queueMicrotask(
                    AsyncContext.Snapshot.wrap(() => void listener(v), {
                        captureOutVariables: true,
                    }),
                );
            }
        }
    }

    const e = new EventsLike<number>();

    const paramVar = new AsyncContext.Variable<string>();
    const returnVar = new AsyncContext.Variable<string>();

    paramVar.run("init", () => {
        e.addListener((v) => {
            console.log(`paramVar: `, paramVar.get());
            console.log(`returnVar: `, returnVar.getReturn());
        });
    });

    returnVar.setReturnIn("notify", () => {
        e.notify(10);
    });
}

// Example with pure promises
{
    const paramVar = new AsyncContext.Variable<string>();
    const returnVar = new AsyncContext.Variable<string>();

    const promise = new AsyncContext.Promise<number>((resolve) => {
        globalThis.setTimeout(() => {
            returnVar.setReturnIn("resolve", () => {
                resolve(10);
            });
        }, 10);
    });

    await paramVar.run("init", () => {
        return promise.then((v) => {
            console.log(`paramVar: `, paramVar.get());
            console.log(`returnVar: `, returnVar.getReturn());
        });
    });
}

// Example with "async await"
{
    const returnVar = new AsyncContext.Variable<string>();
    const paramVar = new AsyncContext.Variable<string>();
    const mixedVar = new AsyncContext.Variable<string>();

    const f = AsyncContext.async(function* f(): Generator<any, void, any> {
        yield new AsyncContext.Promise((resolve) => {
            globalThis.setTimeout(() => {
                mixedVar.setReturnIn("resolve mixed", () => {
                    returnVar.setReturnIn("resolve", () => {
                        resolve(null);
                    });
                });
            });
        });
        console.log(`paramVar: `, paramVar.get());
        console.log(`returnVar: `, returnVar.getReturn());
        console.log(`mixedVar param: `, mixedVar.get());
        console.log(`mixedVar return: `, mixedVar.getReturn());
    });

    await paramVar.run("init", () => {
        return mixedVar.run("mixed init", () => {
            return f();
        });
    });
}

// Example with Promise.all

{
    const returnVar = new AsyncContext.Variable<string>();
    const paramVar = new AsyncContext.Variable<string>();

    await paramVar.run("init", () => {
        return AsyncContext.Promise.all([
            new AsyncContext.Promise((resolve) => {
                returnVar.setReturnIn("foo", () => {
                    resolve("foofoo");
                });
            }),
            (async () => {
                returnVar.setReturn("bar");
                return "barbar";
            })(),
        ]).then(() => {
            console.log(`paramVar: `, paramVar.get());
            console.log(`returnVar: `, returnVar.getReturn());
            returnVar.setReturn("new return");
        });
    });
}

Qard commented 1 month ago

On the other hand, my understanding (and I believe many others) is the exact opposite - that not preserving the context mapping across await doesn't get you anything you can't already basically do with a mutable global variable (and/or mutable state within an AsyncContext.Variable) - if the VM isn't handling the snapshot swapping across awaits, then I don't really understand what it's doing that isn't pretty easy to do in userland?

Mutable globals are insufficient because between when the promise-returning function resolves internally and when the await resumes, other things can happen, so we specifically need a way to bind between that promise resolve and its continuation. There are currently no solutions to this, not even patching the world, because await inserts microtasks we can't patch.
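
A contrived example of that gap, with a plain global standing in for whatever mutable store we might try to use:

let lastResolved = null; // a mutable global standing in for the store

async function send(name) {
  await Promise.resolve();           // any real async work widens this gap
  lastResolved = "span-for-" + name; // set just before the returned promise resolves
}

async function processor(name) {
  await send(name);
  // Extra microtask ticks sit between send() setting the value and this line
  // resuming, and the other processor's send() runs in that gap.
  console.log(name, "sees", lastResolved);
}

processor("a"); // with typical microtask ordering: "a sees span-for-b"
processor("b"); // "b sees span-for-b"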