nodejs / node

Node.js JavaScript runtime ✨🐢🚀✨
https://nodejs.org

Async_hooks use case #46278

Open orinatic opened 1 year ago

orinatic commented 1 year ago

Affected URL(s)

https://nodejs.org/api/async_hooks.html#async-hooks

Description of the problem

I need to track state for async contexts that I do not control.

Specifically, I am tracking the 'stack' through setTimeout.

I have overridden setTimeout to be (simplified):

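const handler = {
    apply(target: Function, self: unknown, args: unknown[]) {
      const [func, ...rest] = args as [Function, ...unknown[]]

      // Capture the stack when the timer is scheduled, not when it fires
      const st = getStackTrace()
      const wrapper = function(this: unknown, ...args: unknown[]) {
        getCallStack()?.unshift(st)
        try {
          // apply takes the arguments as a single array
          func.apply(this, args)
        } finally {
          // shift undoes the unshift above, even if the callback throws
          getCallStack()?.shift()
        }
      }

      return Reflect.apply(target, self, [wrapper, ...rest])
    }
  }
setTimeout = new Proxy(setTimeout, handler)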

and getCallStack is (simplified)

import { createHook, executionAsyncId } from "async_hooks"
const callStacks: Map<number, string[]> = new Map()

function init() {
  const id = executionAsyncId()
  callStacks.set(0, [])

  function onThreadInit(id: number, _type: string, triggerId: number, _resource: object) {
    const parentStack = callStacks.get(triggerId)
    if(!parentStack) {
      callStacks.set(id, [])
    } else {
      callStacks.set(id, [...parentStack])
    }
  }

  const hook = createHook({init: onThreadInit})
  hook.enable()
}
init()

export function getCallStack() {
  const id = executionAsyncId()
  return callStacks.get(id)
}

and getStackTrace uses Error to get a stacktrace.
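
For readers following along, here is a minimal sketch of what such a helper might look like. The original getStackTrace is not shown in this thread, so the body below is an assumption: it reads the `stack` property of a fresh Error and drops the header line and the helper's own frame. The `caller` function exists only to demonstrate usage.

```typescript
// Hypothetical stand-in for the getStackTrace helper mentioned above.
// Returns the current stack as one string per frame, nearest frame first.
function getStackTrace(): string[] {
  const stack = new Error().stack ?? "";
  // On V8, line 0 is the "Error" header and line 1 is this helper itself.
  return stack
    .split("\n")
    .slice(2)
    .map((line) => line.trim());
}

// Illustration only: a named function that asks where it is.
function caller(): string[] {
  return getStackTrace();
}

const frames = caller();
console.log(frames[0]);
```

The stack string format is engine-specific, so the two-line offset should be treated as a V8 detail rather than a guarantee.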

Several calls deep inside the function passed to setTimeout, further instrumentation can call getCallStack() to trace where that function was scheduled from.

I do something similar with EventEmitters -- I want to know where an event handler was registered.

The use-case for this is an instrumentation library -- I do not control any of the code other than the specific functions that I overwrite. Thus AsyncLocalStorage, which (as far as I can tell?) requires modifying the async calls, doesn't work for this use-case. Also, threads need to be able to access/get a copy of their parent's storage, which doesn't seem possible with AsyncLocalStorage either?

It is entirely possible I am missing something obvious, but so far async_hooks seems like the only way to do this.

Flarna commented 1 year ago

What could work is the onPropagate hook added a while ago to AsyncLocalStorage.

The main difference is that your init hook uses the triggerId as the parent, whereas AsyncLocalStorage uses the currently executing resource as the parent. In a lot of cases it's the same, but not in all. I don't know if you really need the triggerId here.
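
To make the two candidates for "parent" concrete, here is a small sketch that records both values for every resource created while a hook is enabled. The `records` array and the `InitRecord` shape are names made up for this illustration. For a plain setTimeout call the two ids are expected to coincide; the Node.js documentation describes incoming connections on a TCP server as a case where they differ.

```typescript
import { createHook, executionAsyncId } from "node:async_hooks";

// Hypothetical record shape, used only for this illustration.
interface InitRecord {
  asyncId: number;
  type: string;
  triggerAsyncId: number;   // the resource that caused this one to be created
  executionAsyncId: number; // the resource whose callback was running at creation
}

const records: InitRecord[] = [];

const hook = createHook({
  init(asyncId, type, triggerAsyncId) {
    // Collect into an array instead of logging: console.log is itself an
    // async operation and would re-enter this hook.
    records.push({ asyncId, type, triggerAsyncId, executionAsyncId: executionAsyncId() });
  },
});

hook.enable();
const timer = setTimeout(() => {}, 0);
clearTimeout(timer); // the init hook has already fired; no need to wait
hook.disable();

const timeoutRecord = records.find((r) => r.type === "Timeout");
console.log(timeoutRecord);
```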

Flarna commented 1 year ago

fyi @nodejs/async_hooks

orinatic commented 1 year ago

I honestly have no idea about triggerID vs current id. It might work just as well?

I think the bigger problem is that (as far as I can tell and I may just be wrong), there's no way to use AsyncLocalStorage without actually modifying the async call locations?

I'm constrained in that I am writing a library which is imported by someone else's code. So I can't call asyncLocalStorage.run or anything like that. My hooks need to be transparent from the user's point of view.

To add a bit more context, I'm instrumenting certain sensitive node API calls. So for instance, if a user schedules a task to create a file, I have a handler for fs.openSync. So if the code in question is:

  setTimeout(() => {fs.openSync(<path to file>)}, 100)

then I have a handler for openSync, and it needs a way to access the stack saved in the setTimeout handler.

I could just be misunderstanding how AsyncLocalStorage works, but it seems like there's no way to use it without editing the client code?

Flarna commented 1 year ago

AsyncLocalStorage uses AsyncHooks internally. If you have no need to touch the code now you should have no need to touch it afterwards.
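
To illustrate the point above, here is a rough sketch of how a library could keep AsyncLocalStorage entirely inside its own wrapper, so that client code never calls run itself. This is a sketch under assumptions, not a tested replacement for the hook-based code: `stackStore`, `wrapCallback`, and `trackedSetTimeout` are hypothetical names, and only callbacks that pass through the proxy get a new entry.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical store holding the chain of scheduling stacks for the current context.
const stackStore = new AsyncLocalStorage<string[]>();

// Returns the chain for the current async context (empty when no store is active).
function getCallStack(): string[] {
  return stackStore.getStore() ?? [];
}

// Wraps a callback so it runs with a copy of the parent's chain plus one new entry.
function wrapCallback<A extends unknown[], R>(
  func: (...args: A) => R,
  scheduledAt: string,
): (...args: A) => R {
  // The parent's chain is read at scheduling time, not when the callback fires.
  const chain = [...getCallStack(), scheduledAt];
  return function (this: unknown, ...args: A): R {
    return stackStore.run(chain, () => func.apply(this, args));
  };
}

// A library would assign this proxy to globalThis.setTimeout; it is kept in a
// separate variable here so the sketch does not patch the global.
const trackedSetTimeout = new Proxy(setTimeout, {
  apply(target, self, args: unknown[]) {
    const [func, ...rest] = args;
    if (typeof func !== "function") {
      return Reflect.apply(target, self, args);
    }
    const scheduledAt = new Error("scheduled here").stack ?? "";
    const wrapped = wrapCallback(func as (...a: unknown[]) => unknown, scheduledAt);
    return Reflect.apply(target, self, [wrapped, ...rest]);
  },
});

// Deeper instrumentation (for example a handler around an fs call) would read
// getCallStack() while the callback is running.
trackedSetTimeout(() => {
  console.log(getCallStack().length);
}, 10);
```

Because the store is entered by the wrapper, anything the callback schedules afterwards inherits the same chain through AsyncLocalStorage's normal propagation.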

But thinking once more about this, I fear it's actually not working, as propagateCb is only called if a store is active, which is not the case for you.

So I fear there is currently no replacement for AsyncHooks.

Qard commented 1 year ago

Your use-case is basically that you want long stacktraces? Meaning stack traces that flow over the async barrier back to whatever was ultimately responsible for it eventually being triggered? There are some userland solutions to this, all of which depend on async_hooks internally, and technically V8 can do this over async/await. It's a use-case we're aware of but don't have a solution for yet.
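
As a small illustration of the async/await case: when one async function awaits another, V8 can include the awaiting function as an `async` frame in the error's stack. The sketch below assumes the code runs as native async functions (a transpiler that downlevels async/await would lose these frames); the function names are made up for the example.

```typescript
// Throws only after an await, so the synchronous call stack is already gone.
async function inner(): Promise<void> {
  await Promise.resolve();
  throw new Error("boom");
}

// Awaits inner directly, which is what lets V8 reconstruct the async frame.
async function outer(): Promise<void> {
  await inner();
}

// Runs the chain and returns the stack string of the resulting error.
async function captureStack(): Promise<string> {
  try {
    await outer();
    return "";
  } catch (err) {
    return err instanceof Error ? (err.stack ?? "") : String(err);
  }
}

captureStack().then((stack) => console.log(stack));
```

This only follows await chains. Callbacks scheduled through setTimeout, event emitters, or process.on are not covered, which is the gap the instrumentation above is trying to fill.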

orinatic commented 1 year ago

I'm not 100% sure on the terminology, but I think so, yes?

Specifically, I'm tracking event handlers, setTimeout (and siblings), and process.on, along with async/await.

Can you elaborate on the V8 solution, or does it only apply to async/await and not to the others?
