DataDog / dd-trace-js

JavaScript APM Tracer
https://docs.datadoghq.com/tracing/
Other
654 stars 309 forks source link

[express] Add hooks for middleware #1716

Open BigsonLvrocha opened 2 years ago

BigsonLvrocha commented 2 years ago

There is only hooks for request spans, but not for middlewares I'd like to have that for middlewares as well

I can implement this with a pull request also Been thinking about an api like this:

tracer.use('express', {
  hooks: {
    middleware(span, error, req, res) {
      // code here
    }
  }
}
rochdev commented 2 years ago

What is your use case? I see what you're adding an error in there, are you trying to capture an error from one of your middleware?

BigsonLvrocha commented 2 years ago

Yes, the current implementation only uses the fields name, message and stack of the Error object for tagging the span, but I want to have more control on that by using any extra public fields a custom error might have.

I had a case where message was too generic and a better info on the error was in the field data but that was not logged in the trace or in the logs

rochdev commented 2 years ago

What is the data type of data? If we logged additional fields from error objects when available would that work for your use case?

BigsonLvrocha commented 2 years ago

It's string, if logged fields that are string, yes that would help :)

BigsonLvrocha commented 2 years ago

but for any fields, there is the problem where it could be anything, even circular references. I don't known how that would be solved

rochdev commented 2 years ago

That's already handled by the internal tagger which detects circular dependencies. However, for this use case specifically it will probably make more sense to only allow primitives because there are often complex objects like req attached to the error object which would be too verbose.

BigsonLvrocha commented 2 years ago

Yes, the option to allow only primitive is a good one, when it's an object it would be a string placeholder.

pinko-fowle commented 1 year ago

What is your use case? I see what you're adding an error in there, are you trying to capture an error from one of your middleware?

@rochdev In our use case, we are trying to isolate some infrequent memory explosions in our long node.js pipelines. We would love to have a middleware hook that would let us add the ability to capture memory before & after each hook.

Hooks around middleware seem like a pretty fine opportunity. We can write a higher-order function to add some work to our existing middlewares, but this would require changing a lot of code, in a number of affected projects. Just having a hook would be a great general capability for adding some finer instrumentation to our code, and would be a good tool for the dd-trace toolbox.

rochdev commented 1 year ago

@pinko-fowle This type of use case would probably better be served by hooks or events directly at the framework level, but it's very unlikely that this would ever land in Express at this point.

As for dd-trace, later on we'll have more flexible hooks in the form of diagnostics channel events to intercept any span when it's created or its state changes, but right now the closest to this would be our internal channels used for instrumentation. These are not public so there are no guarantees about their names or the data they receive, but for your use case it sounds like that wouldn't be too much of an issue.

For example:

const { channel } = require('diagnostics_channel')

const enterChannel = channel('apm:express:middleware:enter')
const exitChannel = channel('apm:express:middleware:exit')

enterChannel.subscribe({ req } => {
  // capture memory when a middleware is entered
})

exitChannel.subscribe({ req } => {
  // capture memory when a middleware is exited
})

Another alternative if you have profiling enabled in your account would be to use that instead, since it profiles heap allocations so you would be able to see what is allocating the memory that is being used for the time period of each profile. It also has additional tooling in the UI to debug performance issues with more granularity than tracing since tracing focuses more broadly on the entire distributed system and profiling looks at exactly what resources the code is using.