Closed yairk closed 4 months ago
@vdeturckheim, @quard FYI
@vdeturckheim, @Quard will you be able to weigh in on the above?
Hey, sorry for the late reply, for some reason I missed the ping. I need to run a couple tests - I will do it tonight an circle back :)
So, this is somehow an expected behavior tbh, I slightly changed the code to
'use strict';
'use strict';
const async_hooks = require('async_hooks');
const fs = require('fs');
const util = require('util');
let asyncLocalStorage = new async_hooks.AsyncLocalStorage();
asyncLocalStorage.enterWith({blah: 7});
function log(...args) {
fs.writeSync(1, `${util.format(...args)}\n`);
}
function getCurrentId() {
return async_hooks.executionAsyncId();
}
async_hooks.createHook({
init(asyncId, type, triggerAsyncId, resource) {
log(`${triggerAsyncId} => ${asyncId}`);
}
}).enable();
log(`before all async id: ${async_hooks.executionAsyncId()}`);
Promise.resolve()
.then(() => {
log(`first promise async id: ${async_hooks.executionAsyncId()}`);
log(`first promise: ${getCurrentId()}`);
log(`before enterWith, id in first promise=${asyncLocalStorage.getStore().id} and blah=${asyncLocalStorage.getStore().blah}`);
asyncLocalStorage.enterWith({id: 7});
log(`id in first promise=${asyncLocalStorage.getStore().id} and blah=${asyncLocalStorage.getStore().blah}`);
log(`first promise done`);
})
.then(() => {
log(`second promise async id: ${async_hooks.executionAsyncId()}`);
log(`second promise: ${getCurrentId()}`);
log(`id in second promise=${asyncLocalStorage.getStore().id} and blah=${asyncLocalStorage.getStore().blah}`);
});
and it logs
before all async id: 1
1 => 2
2 => 3
3 => 4
first promise async id: 3
first promise: 3
before enterWith, id in first promise=undefined and blah=7
id in first promise=7 and blah=undefined
first promise done
second promise async id: 4
second promise: 4
id in second promise=undefined and blah=7
Basically, the second promise async scope is created before the first one ran. At this point, they share the same object as strore. When updating the store in the promise, this was not spread because there was no new call to init.
That's a bit why I actually recommend to always avoid using enterWith
and use run
as much as possible (the callback-using syntax of run
makes it way harder to fall in these situations).
I see a few hacks we could use in the API to fix this behavior but I don't see any being memory safe or that would not introduce other complicated cases. Maybe @Qard or @Flarna can see something I miss.
(cc @bmeck I think this is one of the cases that concerns you regarding this API).
correct
This is expected behaviour. The init
of each promise is when .then(...)
is called, not when the callback given to it is called. AsyncLocalStorage
only follows the init
path of async_hooks
, while what you seem to want here is for it to follow the promiseResolve
path. Even run(...)
is not going to do what you want here:
const als = new AsyncLocalStorage()
als.run({ foo: 'bar' }, () => {
Promise.resolve()
.then(() => {
return als.run({ foo: 'baz' }, () => Promise.resolve())
})
.then(() => {
console.log(als.getStore()) // contains foo: bar
})
})
@Qard @vdeturckheim thanks for looking into this! Indeed, this matches my experience that AsyncLocalStorage
only propagates its store at the time of an init
.
I encountered this when creating tests using jest
(which is based on jasmine
). Essentially I found that if I set the store in a beforeEach
hook, it would not be visible to the test body. Turns out that jasmine
first builds a Promise chain and only then starts invoking the hooks and the test.
I'm tempted to implement my own flavor of AsyncLocalStorage
which would also propagate the store in a PromiseResolve
hook. So in this example here it would use the init
hook to keep track of the parent-child relationships, and once the first Promise is resolved, it would propagate the store to the child contexts created by this Promise.
Are there caveats here that I'm not anticipating? This is my first time working with async hooks
so my thinking here could be completely wrong.
That sounds reasonable to me. You will just need to make sure you have a way to get the promise resource from the promiseResolve hook so you can propagate the context at that point. It might also make sense as an extra setting for AsyncLocalStorage to propagate in that way, if desired. Just be aware that there is a very specific reason this is not the default behaviour. Allow me to explain.
A promise is conceptually not itself async per-se, but rather it can be seen as an abstraction of something else async like an event emitter is sync until you emit from within some other asyncrony. But the fact that a promise handler is scheduled on the microtask queue makes it a form of queueing system. Normally the suggested way to handle function queuing systems is to wrap them with AsyncResource. Due to the low-level nature of promises this is not possible so we instead use PromiseHook which roughly corresponds to the async_hooks lifecycle events.
But why does init happen where it does, and why is promiseResolve a separate thing? From the user perspective, AsyncResource ensures a context links back to the user code which supplied the callback and not to internals somewhere such as a connection pool which could potentially trigger from a completely unrelated context. It's the same case here in that what happens within a then handler is internal to the promise whereas the attachment of that then should link back to the context in which it was attached. This problem mostly goes away in async/await as the continuations are effectively nested rather than chained, due to how the spec works, making the outer resolve and the init effectively the same context.
The graph produced by async_hooks is not a pure logical graph, it is an abstract conceptual graph, expressing the structure of the program as seen by the user, not as understood by the underlying system which views it purely as a sequence of interleaved events received from the event loop and mapped to callbacks.
I (think I) get what you mean by async_hooks
producing a graph that corresponds to the structure seen by the user.
The way I'm currently thinking about my use case is that I want a particular piece of context to travel across "scheduling details" such as async/await, Promises, queues, events, etc. Seems like this is in conflict with your description of the way async_hooks
works - from my perspective, I don't care when the Promise context is created; I only care about the actual execution of my code and the places where it "branches". From this perspective, moving to the next Promise in the chain should not lose the context.
If I understand you correctly, you're saying that AsyncResource
is the right tool for this job, but that it doesn't work well with Promises? Are PromiseHooks exposed to Javascript code, or are they considered internal?
As for enhancing (or "forking") AsyncLocalStorage
, I think this can create problems if some of the Promises do async work, for example:
'use strict';
const async_hooks = require('async_hooks');
const fs = require('fs');
const util = require('util');
let asyncLocalStorage = new async_hooks.AsyncLocalStorage();
asyncLocalStorage.enterWith({blah: 7});
function log(...args) {
fs.writeSync(1, `${util.format(...args)}\n`);
}
async_hooks.createHook({
init(asyncId, type, triggerAsyncId, resource) {
log(`${triggerAsyncId} => ${asyncId}`);
},
promiseResolve(asyncId) {
log(`resolve ${asyncId}`);
}
}).enable();
log(`before all async id: ${async_hooks.executionAsyncId()}`);
Promise.resolve()
.then(async () => {
log(`first promise async id: ${async_hooks.executionAsyncId()}`);
await new Promise((res, rej) => setTimeout(() => res(), 1000)); // <-- creates a new async context
asyncLocalStorage.enterWith({id: 7});
log(`first promise async id: ${async_hooks.executionAsyncId()}`);
log(`id in first promise=${asyncLocalStorage.getStore().id} and blah=${asyncLocalStorage.getStore().blah}`);
log(`first promise done`);
})
.then(() => {
log(`second promise async id: ${async_hooks.executionAsyncId()}`);
log(`id in second promise=${asyncLocalStorage.getStore().id} and blah=${asyncLocalStorage.getStore().blah}`);
});
Now, the second Promise runs in a child context of the first Promise, but enterWith
is running in a sibling context to the second Promise. So it's not strictly about parent-child relationships anymore...
I can probably figure out how to deal with this scenario, but this again makes me wary that the real problem here is with what I expect, not with the existing language mechanisms.
PromiseHook is part of V8. I'm working on a rewrite which should (eventually) enable it to be exposed in JavaScript cleanly, as a separate API, but it's not there yet. Currently it has a partly JavaScript-facing internal API, but only conditionally, so it's not yet possible for it to be used directly.
As for context flowing through promises, there's two ways to look at that: what you want is the internal path which flows through the resolving logic, whereas what it does now is step around that, binding the callback directly to the point where then(...) was called. This is important because a promise is a container which can be passed around and then(...) could be run from anywhere. If you were to flow context from the resolve path, that path might be entirely unrelated to the actual user code path, for example, a common pattern in database libraries is to create a connection promise out-of-band from a request, and do a then(...) continuation on it whenever the connection is needed. Flowing from the resolve path in this case would point outside of the request, which would be incorrect. By linking context of a then/catch handler directly to the point where it is called, it ensures the context within that callback is conceptually valid in expressing the relation in user code between the point where the handler is attached and the point where it is run. It's still possible to interact with the stored object if the promise resolution is a direct result of actions with the request, but replacing the object will not flow into the handler because the resolve and the corresponding promise are decoupled with a microtask barrier between when the resolve actually happens and when the handler runs--the handler could theoretically be put off indefinitely even though the resolve has already happened, depending on interactions with the microtask queue.
To your database example, I understand why one would expect the ALS store to be propagated during init
, but I would argue that it would make sense to also propagate it across the resolution path. If the DB library created a connection promise and attached a then
continuation, I think it's very plausible that any ALS store updates made in the connection promise, should also be propagated to the continuation. And for cases where this is not desired, we can have another construct with similar semantics to AsyncLocalStorage.run
which would run a piece of code under a certain context without "leaking" it to the rest of the promise.
I played some more with async_hooks
and realized it can be challenging for me to propagate the context correctly across Promise resolutions. For example, with the following code:
'use strict';
const async_hooks = require('async_hooks');
const fs = require('fs');
const util = require('util');
let asyncLocalStorage = new async_hooks.AsyncLocalStorage();
asyncLocalStorage.enterWith({blah: 7});
function log(...args) {
fs.writeSync(1, `${util.format(...args)}\n`);
}
async_hooks.createHook({
init(asyncId, type, triggerAsyncId, resource) {
log(`${triggerAsyncId} => ${asyncId} (${type})`);
},
promiseResolve(asyncId) {
log(`resolve ${asyncId}`);
},
after(asyncId) {
log(`after ${asyncId}`);
},
destroy(asyncId) {
log(`destroy ${asyncId}`);
},
}).enable();
log(`before all async id: ${async_hooks.executionAsyncId()}`);
Promise.resolve()
.then(async () => {
// runs in context id 3
log(`first promise async id: ${async_hooks.executionAsyncId()}`);
await new Promise((res, rej) => setTimeout(() => res(), 1000));
// runs in context id 8 (grandparent: 3)
Promise.resolve()
.then(() => {
// runs in context id 11 (grandparent: 8)
log(`subpromise: ${async_hooks.executionAsyncId()}`);
});
// still runs in context id 8 (grandparent: 3)
asyncLocalStorage.enterWith({id: 7});
log(`first promise async id: ${async_hooks.executionAsyncId()}`);
log(`id in first promise=${asyncLocalStorage.getStore().id} and blah=${asyncLocalStorage.getStore().blah}`);
log(`first promise done`);
})
.then(() => {
// runs in context id 4 (parent: 3)
log(`second promise async id: ${async_hooks.executionAsyncId()}`);
log(`id in second promise=${asyncLocalStorage.getStore().id} and blah=${asyncLocalStorage.getStore().blah}`);
});
I get the following output:
before all async id: 1
1 => 2 (PROMISE)
resolve 2
2 => 3 (PROMISE)
3 => 4 (PROMISE)
3 => 5 (PROMISE)
first promise async id: 3
3 => 6 (PROMISE)
3 => 7 (Timeout)
6 => 8 (PROMISE)
resolve 3
after 3
5 => 9 (PROMISE)
resolve 6
after 7
8 => 10 (PROMISE)
resolve 10
10 => 11 (PROMISE)
first promise async id: 8
id in first promise=7 and blah=undefined
first promise done
resolve 5
resolve 8
after 8
subpromise: 11
resolve 11
after 11
resolve 3
resolve 9
after 9
second promise async id: 4
id in second promise=undefined and blah=7
resolve 4
after 4
destroy 7
So now I set my store in the first Promise, but in a context that's a grandchild of the Promise context. That context is not a parent of the context of the second Promise. Rather, they are siblings (or even cousin and uncle...). So this means I have to do a more sophisticated relationship-analysis when deciding where I need to propagate the store upon promiseResolve
.
To complicate things further, I wouldn't expect store updates done within the sub-Promise to propagate to the second Promise, but I can't come up with a good mechanism for ensuring this, without relying on the exact context parent/child tree which seems incredibly fragile. I thought that maybe the after
or destroy
hooks would help me see where a context was completely destroyed, but I added the logs and I don't see a way to use them.
It seems like the only thing to do at this point is to keep in mind the potential pitfalls and use the API very carefully. I'm also looking at https://www.npmjs.com/package/async-local-storage, which is similar to AsyncLocalStorage
with one key difference that during a store lookup it dynamically traverses the parent/child tree to find where the store was set. This solves my scenario, with the cost of introducing other pitfalls (sometimes changing the store will affect contexts that were created before the store update).
hey @yairk any update about the issue? We are experience the same in node 16.17 and jest. If we init AsyncLocalStorage
in before each
its being lost in the second test suite.
Note: jest is ran with --runInBand
hey @yairk any update about the issue? We are experience the same in node 16.17 and jest. If we init
AsyncLocalStorage
inbefore each
its being lost in the second test suite. Note: jest is ran with--runInBand
@ezekiel25-17 we did not find a way to do this purely with AsyncLocalStorage
. We ended up creating a hack: we created a wrapper around AsyncLocalStorage
that either retrieves data from ALS or from a global variable that is only set in the tests...
Hi everyone 👋
I've been running into this kind of scenario with enterWith
too. However, I noticed something even more strange when using it.
Here are two examples:
Maybe through your testing, someone could shed some light on why the second one acts like this.
Note that both examples are run using Node 16.18.0 without any transpiler (which I initially thought could be messing with the compiled code).
const { AsyncLocalStorage } = require('async_hooks');
const als = new AsyncLocalStorage();
/** Runs the main code and reads the state of the AsyncLocalStorage after execution */
async function main() {
console.log('main 1, store:', als.getStore());
await doSomething();
console.log('main 2, store:', als.getStore());
}
/**
* Sets the AsyncLocalStorage value with `enterWith` for the remainder of the current synchronous execution
* https://nodejs.org/api/async_context.html#asynclocalstorageenterwithstore
*/
function prepareALS(value) {
als.enterWith(value);
}
/** Play with the store */
async function doSomething() {
console.log('doSomething 1, store:', als.getStore());
prepareALS(1);
console.log('doSomething 2, store:', als.getStore());
prepareALS(2);
console.log('doSomething 3, store:', als.getStore());
prepareALS(3);
console.log('doSomething 4, store:', als.getStore());
}
main()
.then(() => console.log('done'))
.catch((err) => console.error(err));
Resulting logs:
main 1, store: undefined <= expected, no value set yet
doSomething 1, store: undefined <= expected, no value set yet
doSomething 2, store: 1 <= expected, value just set to 1 in the current context
doSomething 3, store: 2 <= expected, value just set to 2 in the current context
doSomething 4, store: 3 <= expected, value just set to 3 in the current context
main 2, store: 3 <= expected, value was lastly set to 3 in the synchronous context
done
Notice that only the prepareALS
function was changed from sync to async
const { AsyncLocalStorage } = require('async_hooks');
const als = new AsyncLocalStorage();
/** Runs the main code and reads the state of the AsyncLocalStorage after execution */
async function main() {
console.log('main 1', als.getStore());
await doSomething();
console.log('main 2', als.getStore());
}
/**
* Sets the AsyncLocalStorage value with `enterWith` for the remainder of the current synchronous execution
* https://nodejs.org/api/async_context.html#asynclocalstorageenterwithstore
*/
async function prepareALS(value) {
als.enterWith(value);
}
/** Play with the store */
async function doSomething() {
console.log('doSomething 1', als.getStore());
await prepareALS(1);
console.log('doSomething 2', als.getStore());
await prepareALS(2);
console.log('doSomething 3', als.getStore());
await prepareALS(3);
console.log('doSomething 4', als.getStore());
}
main()
.then(() => console.log('done'))
.catch((err) => console.error(err));
Resulting logs:
main 1, store: undefined <= expected, no value set yet
doSomething 1, store: undefined <= expected, no value set yet
doSomething 2, store: 1 <= expected, value just set to 1 in the current context
doSomething 3, store: 2 <= expected, value just set to 2 in the current context
doSomething 4, store: 3 <= expected, value just set to 3 in the current context
main 2, store: 1 <= ?! unexpected
done
So at the end of the second example, I see two possible coherent results:
undefined
if it actually "Transitions into the context for the remainder of the current synchronous execution" (cf the official Node's documentation about enterWith
)However, the ALS ends up keeping the value set in the first asynchronous call for some reason that I can't explain. Thanks for your help.
You changed a bit more then prepareALS
- you changed also doSomething
do await
it.
each await
creates a a few promises and async operations.
Only the call to await prepareALS(1)
is on the sync path therefore this value is finally on the store at the end of main()
.
All the others run in a new async operation and as a result the enterWith()
calls done there have no impact to the initial sync operation.
Hi @Flarna. Thanks for the quick reply and the explanation, that makes sense 👍
So I've finally managed to reach the state that I wanted through a "simple" workaround and ensure the last value set on the AsyncLocalStorage, even in a further Promise, is properly readable when back in the main sync context. This could help other people trying to achieve the same behavior (which is what I initially wanted). (CC @yairk @ezekiel25-17 if that works for you).
The idea is simply to always bind the execution context of a function that gets the AsyncLocalStorage store to the latest context that set the store value via enterWith
.
Let me know if you see anything that could go wrong with this method, any comments are welcome 🙂
const { AsyncLocalStorage, AsyncResource } = require('async_hooks');
const als = new AsyncLocalStorage();
/** Runs the main code and reads the state of the AsyncLocalStorage after execution */
async function main() {
console.log('main 1, store:', getLatestStore());
await doSomething();
console.log('main 2, store:', getLatestStore());
}
/** Our mutable function allowing to get the ALS store in the last context that it was set */
let getLatestStore = getStore;
function getStore() {
return als.getStore();
}
/**
* Sets the AsyncLocalStorage value with `enterWith` for the remainder of the current synchronous execution
* https://nodejs.org/api/async_context.html#asynclocalstorageenterwithstore
*/
async function prepareALS(value) {
als.enterWith(value);
// Set `getLatestStore` to return the store in the current context
getLatestStore = AsyncResource.bind(getStore);
}
/** Play with the store */
async function doSomething() {
console.log('doSomething 1, store:', getLatestStore());
await prepareALS(1);
console.log('doSomething 2, store:', getLatestStore());
await prepareALS(2);
console.log('doSomething 3, store:', getLatestStore());
await prepareALS(3);
console.log('doSomething 4, store:', getLatestStore());
}
main()
.then(() => console.log('done'))
.catch((err) => console.error(err));
Resulting logs:
main 1, store: undefined
doSomething 1, store: undefined
doSomething 2, store: 1
doSomething 3, store: 2
doSomething 4, store: 3
main 2, store: 3
done
Honestly speaking I don't know the use case here. Usually using enterWith
is a bad idea and run
should be used. enterWith
was intended for situations where it is not possible to wrap the relevant code and use run
.
There might be also situations where you really want to set something on local store for the remainder of the current synchronous execution but you should really know that only relevant code is running there (incl. node internals,...).
I assume in your final setup getLatestStore
will be instantiated once per request/task or whatever you execute instead once per application. Otherwise I would assume that parallel tasks overwrite getLatestStore
.
if not you could avoid using AsyncLocalStorage
and just store the value into this variable instead.
I also don't understand why prepareALS
is an async
function but maybe that's just to illustrate that doSomething
has to await
something.
I would recommend against using enterWith(...)
. There's a strongly-worded warning against its use in the docs for a good reason. There are lots of cases like this one where persisting until the end of the sync tick can be very confusing. It's much better to use the run(...)
function as it's more clearly scoped.
I filed https://github.com/nodejs/node/issues/45848 which is basically a dup of this issue, but there is one little wrinkle I don't see called out in the comments above. While I get that enterWith
is problematic and that inner async context shouldn't propagate to the outer async scope, what's odd is that sometimes it does, which I think trips people up when they try to use enterWith
in jest.
Consider these examples where the inner async context behavior changes depending on whether other stores are in play and if you use async/await syntax vs. promises:
const { AsyncLocalStorage } = require("async_hooks");
const als = new AsyncLocalStorage();
Promise.resolve()
.then(() => als.enterWith("inner nested async"))
.then(() => console.log(als.getStore())) // undefined 👍
console.log(als.getStore()); // undefined 👍
const { AsyncLocalStorage } = require("async_hooks");
const als = new AsyncLocalStorage();
const unrelatedAls = new AsyncLocalStorage();
unrelatedAls.run("unrelated", () => {}); // Could also be in `Promise.resolve().then(() => { ... });`
async function main() {
await Promise.resolve().then(() => als.enterWith("inner nested async"));
console.log(als.getStore()); // undefined 👍
}
main().then(() => console.log(als.getStore())); // undefined 👍
console.log(als.getStore()); // undefined 👍
const { AsyncLocalStorage } = require("async_hooks");
const als = new AsyncLocalStorage();
async function main() {
await Promise.resolve().then(() => als.enterWith("inner nested async"));
console.log(als.getStore()); // "inner nested async" 😱
}
main().then(() => console.log(als.getStore())); // undefined 👍
console.log(als.getStore()); // undefined 👍
Because the behavior changes depending on seemingly unrelated ALS stuff, in Jest land this can manifest itself as test suites that pass/fail depending on whether they are run in isolation.
It seems there has been no activity on this issue for a while, and it is being closed in 30 days. If you believe this issue should remain open, please leave a comment. If you need further assistance or have questions, you can also search for similar issues on Stack Overflow. Make sure to look at the README file for the most updated links.
It seems there has been no activity on this issue for a while, and it is being closed. If you believe this issue should remain open, please leave a comment. If you need further assistance or have questions, you can also search for similar issues on Stack Overflow. Make sure to look at the README file for the most updated links.
I've been playing with
AsyncLocalStorage
with the purpose of using it for propagating context across web requests. I had issues getting it to work in Jest-based unit-tests: specifically, stores set inbeforeEach
would not propagate to the tests. Ultimately I realized that the test's async context was created prior to thebeforeEach
hook being actually run.While trying to investigate this, I came up with the following code:
With some debug prints enabled (courtesy of async-local-storage), the above code gives the following output:
Am I missing something in expecting the store to propagate to the third promise as well?