nodejs / node

Node.js JavaScript runtime βœ¨πŸ’πŸš€βœ¨
https://nodejs.org
Other
105.7k stars 28.67k forks source link

AsyncLocalStorage: inconsistent propagation of nested async context when resolving outer promise #45848

Open jenseng opened 1 year ago

jenseng commented 1 year ago

Version

v18.12.1

Platform

Darwin nfml-jonj5MM 21.6.0 Darwin Kernel Version 21.6.0: Wed Aug 10 14:25:27 PDT 2022; root:xnu-8020.141.5~2/RELEASE_X86_64 x86_64

Subsystem

async_hooks

What steps will reproduce the bug?

Run this repro:

const { AsyncLocalStorage } = require("async_hooks");
const assert = require("assert");

const store = new AsyncLocalStorage();

const unrelatedStore = new AsyncLocalStorage();
// the test passes if you comment this out
unrelatedStore.enterWith("unrelated");

// oversimplification of jest internals πŸ˜…
async function testHarness() {
  // the test also passes if you enter synchronously, e.g. change to
  // `store.enterWith("value")` or `await new Promise((resolve) => resolve(store.enterWith("value")));`
  await Promise.resolve().then(() => store.enterWith("value"));
  assert.equal(store.getStore(), "value");
}
testHarness();

How often does it reproduce? Is there a required condition?

The test fails every time, unless you tweak it as indicated in the comments, in which case it always passes.

What is the expected behavior?

It should behave consistently independent of other ALSes. It should either always propagate the inner async context, or it never should.

What do you see instead?

node:internal/process/promises:288
            triggerUncaughtException(err, true /* fromPromise */);
            ^

AssertionError [ERR_ASSERTION]: undefined == 'value'
    at testHarness (/Users/jonj/projects/jest-als-bug/src/simple-repro2.js:15:10) {
  generatedMessage: true,
  code: 'ERR_ASSERTION',
  actual: undefined,
  expected: 'value',
  operator: '=='
}

Additional information

This was originally reported as a Jest bug, but further debugging indicates it's an issue in Node.js. The basic summary is if you:

  1. Set an ALS store (either via enterWith or run) (e.g. in a test suite's beforeEach)
  2. Asynchronously set an ALS store in the course of resolving a promise (e.g. in another test suite's beforeEach)
  3. The second store will be undefined in any async operations that key off that promise (e.g. the test code). If you omit step 1 or make step 2 synchronous the store value will be available.

I think an argument could be made that this is (mostly) working as designed since the store is being set in an inner async context and shouldn't apply to other operations in the testHarness (see repro). But if that's the case, why does it propagate when there is no other ALS? Ideally the behavior should be consistent independent of the unrelated ALS, and the docs could probably be clarified a bit.

Here's a Jest repro that might be more representative of how an end-user could encounter this issue. Since Jest has no aroundEach functionality, end-users might be tempted to try to set a store along these lines, which could then encounter context loss:

const { AsyncLocalStorage } = require("async_hooks");
const store = new AsyncLocalStorage();

describe("store", () => {
  beforeEach(async () => {
    store.enterWith("value");
  });

  it("the store should be set", () => {
    // this passes, unless another store gets set 😬 (even a different ALS in another test suite!)
    expect(store.getStore()).toEqual("value");
  });
});

From a Jest end-user perspective the test runner code is a black box, so it would be nice if the ALS context propagated to those outer chained async operations. While there are workarounds like AsyncResource.bind(), they may not be ergonomic or practical for end-users. That said, I could see that kind of propagation being problematic and having unintended consequences, so it might be better to make it never work (i.e. even in the no-other-ALS case), and make the docs a little more clear.

jenseng commented 1 year ago

Based on my reading of https://github.com/nodejs/help/issues/3041, it sounds like the nested async context should not propagate to the outer async context. So it seems like:

  1. The (minor) bug here is that sometimes it does (see repro above), and
  2. It might also be good to update the ALS docs to drive these points home. The current wording feels too loose: [it] persists the store through any following asynchronous calls, which could reasonably be understood to include (outer) asynchronous calls triggered by subsequent synchronous promise fulfillment.
jenseng commented 1 year ago

Some more examples copied/adapted from this comment:

  1. βœ… Nested context does not propagate through outer promise resolution when using Promises
    const { AsyncLocalStorage } = require("async_hooks");
    const als = new AsyncLocalStorage();
    Promise.resolve()
      .then(() => als.enterWith("inner nested async"))
      .then(() => console.log(als.getStore())) // undefined πŸ‘ 
    console.log(als.getStore()); // undefined πŸ‘ 
  2. βœ… Nested context does not propagate through outer promise resolution when using async/await, so long as other ALS stuff has already happened
    const { AsyncLocalStorage } = require("async_hooks");
    const als = new AsyncLocalStorage();
    const unrelatedAls = new AsyncLocalStorage();
    unrelatedAls.run("unrelated", () => {}); // Could also be in `Promise.resolve().then(() => { ... });`
    async function main() {
      await Promise.resolve().then(() => als.enterWith("inner nested async"));
      console.log(als.getStore()); // undefined πŸ‘ 
    }
    main().then(() => console.log(als.getStore())); // undefined πŸ‘
    console.log(als.getStore()); // undefined πŸ‘ 
  3. ❌ Nested context does propagate through outer promise resolution when using async/await
    const { AsyncLocalStorage } = require("async_hooks");
    const als = new AsyncLocalStorage();
    async function main() {
      await Promise.resolve().then(() => als.enterWith("inner nested async"));
      console.log(als.getStore()); // "inner nested async" 😱
    }
    main().then(() => console.log(als.getStore())); // undefined πŸ‘
    console.log(als.getStore()); // undefined πŸ‘

Note that 2 and 3 are identical, except that 2 does some other ALS stuff beforehand. Also worth pointing out that behavior doesn't change with different async styles:

async function main() {
  await (async () => { await 42; als.enterWith("inner nested async"); })();
  console.log(als.getStore()); // "inner nested async" 😱
}

and even deeply nested async:

async function main() {
  await (async () => {
    await 42;
    await (async () => {
      await 42;
      await (async () => {
        await 42;
        als.enterWith("inner nested async");
      })();
    })();
  })();
  console.log(als.getStore()); // "inner nested async" 😱
}
jenseng commented 1 year ago

Demonstration of how it only works the very first time:

const { AsyncLocalStorage } = require("async_hooks");
const als = new AsyncLocalStorage();
async function main() {
  await Promise.resolve().then(() => als.enterWith("inner nested async"));
  console.log(als.getStore());
  await Promise.resolve().then(() => als.enterWith("something else"));
  console.log(als.getStore());
}
main().then(main);

Outputs:

inner nested async
inner nested async
undefined
undefined

If the desired behavior is propagation, then we should expect alternating inner nested async and something else. If the desired behavior is no propagation, then we should expect undefined every time.

Qard commented 1 year ago

You can blame async_hooks for that. Because AsyncLocalStorage doesn't enable async_hooks until run or enterWith is called, async_hooks will not be producing events until that point. Basically when the resource stack unwinds there will be nothing to unwind to so it will just leave it as-is because enterWith doesn't have an exit condition. This is one of the many reasons I recommend against using enterWith and have been working on a safer replacement.

jenseng commented 1 year ago

Thanks for the context! Is it correct to assume that "fixing" this enterWith issue might require enabling async_hooks unilaterally? I imagine that could be problematic, making this a wontfix candidate, though would love to see a little more clarity in the docs around enterWith gotchas like this.

Looking forward to the safer replacement! πŸ™Œ

Qard commented 1 year ago

The list of gotchas with enterWith is very long and probably not fully knowable, which is why we took the approach of just warning against its use and why I've been working on alternatives so we can deprecate it.

The alternative I'm working on for the currently known use cases for it, which is not immediately obvious, is in #44943. Look at the channel.runStores(...) method. That will allow disjointed code to correctly scope AsyncLocalStorage context data so long as the producing end wraps the desired scope in a channel.runStores(...) invocation. This fits the model required of AsyncLocalStorage for APMs but may not fit all possible use cases. If that doesn't fit yours I'd appreciate more detail as to why and what exactly you're trying to do.

Flarna commented 1 year ago

Maybe we should add an option to AsyncLocalStorage constructor to enable the async hooks at construction time? Or even do it on default?

Once AsyncLocalStorage is used once via enterWith() or run() the hooks are enabled and never disabled (except explicit calling disable() or exit().

Sure, AsyncHooks have overhead. But I assume that most applications instantiating AsyncLocalStorage use it frequently therefore the performance impact is only theoretical as actually the hooks are enabled all the time.

Qard commented 1 year ago

Probably sufficient to just enable the hooks in AsyncLocalStorage construction rather than run.

jenseng commented 1 year ago

This fits the model required of AsyncLocalStorage for APMs but may not fit all possible use cases. If that doesn't fit yours I'd appreciate more detail as to why and what exactly you're trying to do.

I think that could work for the jest case, though it's a smidge more verbose than just using (the problematic) ALS.enterWith(). That said, there are also other workarounds, so it's not a huge deal. E.g. the current approach I've taken is to create a testWithContext function that uses ALS.run() instead:

const contextStore = new AsyncLocalStorage();
function testWithContext(name, context, fn) {
  test(name, () => contextStore.run(context, fn));
}
//...
testWithContext("my test", "some context", () => {
  expect(contextStore.getStore()).toEqual("some context");
});