DataDog / dd-trace-js

JavaScript APM Tracer
https://docs.datadoghq.com/tracing/

Give a way to handle gracefully errors from dd tracer #1564

Open arturkasperek opened 3 years ago

arturkasperek commented 3 years ago

I'm using the http plugin to trace requests in my server application. I noticed that the server restarts quite often because of errors that appear to come from dd-trace.

I was thinking about two solutions. First, an error handler:

...
tracer.use('http', {
  service: 'service-name',
}).error(() => {
  console.log('DD have some problems')
});
...

Second, an error class hierarchy:

process.on('uncaughtException', function(error) {
  if ( error instanceof DDError ) {
    console.log('DD have some problems')
  }
  ...
});

I would be grateful for any solution to my problem. Thank you :-)

rochdev commented 3 years ago

We already catch errors from dd-trace and send them to the debug logs, so it should never happen that we throw. If it does happen then there is a bug. It's worth noting however that dd-trace might show up in many stack traces since we hook into a lot of things, so it's also possible that you are having an existing issue that looks like it comes from dd-trace because of the stack trace even if that's not the case.
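To make the distinction above concrete, here is a minimal sketch that separates an error whose top stack frame is inside dd-trace from one that merely passes through tracer frames. `classifyStack` and its return labels are hypothetical names invented for this illustration; nothing here is part of the dd-trace API, and the `node_modules/dd-trace/` path check is an assumption about how the package is installed.

```javascript
// Hypothetical helper (not part of dd-trace): classify how dd-trace
// appears in an error's stack trace.
function classifyStack(stack) {
  // Keep only the "at ..." frame lines, dropping the message header.
  const frames = String(stack)
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.startsWith('at '));

  if (frames.length === 0) return 'unknown';

  // Assumption: the tracer lives under node_modules/dd-trace/.
  const isTracerFrame = (frame) => /node_modules[\\\/]dd-trace[\\\/]/.test(frame);

  // The first frame is where the error object was created.
  if (isTracerFrame(frames[0])) return 'thrown-in-tracer';

  // Otherwise the tracer is only a wrapper somewhere along the call chain.
  return frames.some(isTracerFrame) ? 'passes-through-tracer' : 'no-tracer-frames';
}
```

Calling `classifyStack(error.stack)` inside a `process.on('uncaughtException', ...)` handler would be one way to log this hint. It is only a heuristic: a rethrow such as `throw e` keeps the original stack, so the file printed at the top of a crash report can differ from the first frame.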

Can you share more information about these errors and stack traces that we could look at?

arturkasperek commented 3 years ago

@rochdev here you go:

/usr/src/app/node_modules/dd-trace/packages/dd-trace/src/scope/base.js:18
      throw e
      ^
Error: read ECONNRESET
    at TCP.onStreamRead (internal/stream_base_commons.js:209:20)
    at TCP.callbackTrampoline (internal/async_hooks.js:126:14)
Emitted 'error' event on ClientRequest instance at:
    at ClientRequest.req.emit (/usr/src/app/node_modules/dd-trace/packages/datadog-plugin-http/src/client.js:106:21)
    at Socket.socketErrorListener (_http_client.js:427:9)
    at /usr/src/app/node_modules/dd-trace/packages/dd-trace/src/scope/base.js:54:19
    at Scope._activate (/usr/src/app/node_modules/dd-trace/packages/dd-trace/src/scope/async_resource.js:53:14)
    at Scope.activate (/usr/src/app/node_modules/dd-trace/packages/dd-trace/src/scope/base.js:12:19)
    at Socket.bound (/usr/src/app/node_modules/dd-trace/packages/dd-trace/src/scope/base.js:53:20)
    at Socket.emit (events.js:314:20)
    at emitErrorNT (internal/streams/destroy.js:92:8)
    at emitErrorAndCloseNT (internal/streams/destroy.js:60:3)
    at processTicksAndRejections (internal/process/task_queues.js:84:21) {
  errno: 'ECONNRESET',
  code: 'ECONNRESET',
  syscall: 'read'
}
/usr/src/app/node_modules/dd-trace/packages/dd-trace/src/scope/base.js:18
      throw e
      ^
TypeError [ERR_INVALID_URL]: Invalid URL: /\
    at onParseError (internal/url.js:257:9)
    at parse (<anonymous>)
    at new URL (internal/url.js:333:5)
    at Server.<anonymous> (/usr/src/app/src/index.js:189:18)
    at Server.emit (events.js:314:20)
    at /usr/src/app/node_modules/dd-trace/packages/datadog-plugin-http/src/server.js:13:23
    at /usr/src/app/node_modules/dd-trace/packages/dd-trace/src/plugins/util/web.js:74:60
    at Scope._activate (/usr/src/app/node_modules/dd-trace/packages/dd-trace/src/scope/async_resource.js:53:14)
    at Scope.activate (/usr/src/app/node_modules/dd-trace/packages/dd-trace/src/scope/base.js:12:19)
    at Object.instrument (/usr/src/app/node_modules/dd-trace/packages/dd-trace/src/plugins/util/web.js:74:39) {
  input: '/\\',
  code: 'ERR_INVALID_URL'
}
[Screenshot 2021-08-10 at 15 31 35]

rochdev commented 3 years ago

For the first and the last one, they should be existing errors that dd-trace is only involved in because the scope manager is present in all asynchronous operations, and we instrument the http module so either the http server or client plugin will show up in all requests. Basically, it looks like these are existing errors that would still occur regardless of the tracer. However, if it doesn't happen when the tracer is disabled, then it's possible that somehow an existing error that should have been caught isn't because of the tracer which would be a bug.
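The first trace includes the line "Emitted 'error' event on ClientRequest instance", which Node.js prints when an `'error'` event is emitted and nothing is listening for it. The sketch below shows that mechanism with a plain `EventEmitter` standing in for the `ClientRequest` (which is itself an `EventEmitter`). The `emitReset` helper and the fabricated `ECONNRESET` error are illustrative assumptions, not code from the application or from dd-trace.

```javascript
const { EventEmitter } = require('events');

// Illustrative helper: emit a connection-reset error on an emitter and
// report whether the emit call threw.
function emitReset(emitter) {
  const err = Object.assign(new Error('read ECONNRESET'), { code: 'ECONNRESET' });
  try {
    emitter.emit('error', err);
    return 'handled';
  } catch (thrown) {
    // With no 'error' listener, emit() throws the error itself.
    return 'thrown';
  }
}

// No listener: the error is thrown and, in a real server, becomes an
// uncaught exception that terminates the process.
const withoutListener = new EventEmitter();

// With a listener: the error is delivered to the callback instead.
const withListener = new EventEmitter();
withListener.on('error', (err) => {
  console.warn('request failed:', err.code);
});

console.log(emitReset(withoutListener));
console.log(emitReset(withListener));
```

For a real outbound request the equivalent would be attaching `.on('error', handler)` to the object returned by `http.request` or `http.get`, which applies whether or not the tracer is loaded.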

For the second one it's a bit more complicated. It looks like it could be the way we parse the route, which is surprising because we use the same logic as Express. It's possible we do things in a different way than your specific version of Express though which could cause this issue.
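Whatever the root cause turns out to be, the second trace shows `new URL` throwing inside a request handler (`src/index.js:189` in the trace). A minimal sketch of guarding that call follows; `safeParseUrl`, `handler`, and the `http://localhost` base are hypothetical and not taken from the reporter's code.

```javascript
// Hypothetical helper: parse a URL without letting a malformed value
// throw out of the caller. Returns null when parsing fails.
function safeParseUrl(rawUrl, base) {
  try {
    return new URL(rawUrl, base);
  } catch (err) {
    return null;
  }
}

// Hypothetical request handler: answer 400 for an unparseable URL
// rather than letting the exception become an uncaught error.
function handler(req, res) {
  // Assumption: req.url is a path, so a placeholder base is supplied.
  const url = safeParseUrl(req.url, 'http://localhost');
  if (url === null) {
    res.statusCode = 400;
    res.end('Bad request URL');
    return;
  }
  res.end(url.pathname);
}
```

Returning `null` keeps the decision about how to respond in the handler, so a malformed path sent by a client results in a 400 response and the process keeps running.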

Are these errors something you are able to easily reproduce with a snippet you could share?

rdsedmundo commented 1 year ago

I just experienced something similar and need the same thing. I'd rather have the instrumentation/monitoring stop working than have it crash a legitimate user-facing request, as happened here.

[dd.trace_id=1906380973951561180 dd.span_id=4729213299456086151] error: Sentry.captureException: fetch failed TypeError: fetch failed
    at Object.fetch (node:internal/deps/undici/undici:11457:11) {
  cause: SocketError: other side closed
      at TLSSocket.onSocketEnd (node:internal/deps/undici/undici:9689:26)
      at TLSSocket.emit (node:events:525:35)
      at TLSSocket.emit (node:domain:489:12)
      at TLSSocket.emit (/opt/nodejs/node_modules/dd-trace/packages/datadog-instrumentations/src/net.js:61:25)
      at endReadableNT (node:internal/streams/readable:1359:12)
      at processTicksAndRejections (node:internal/process/task_queues:82:21) {
    code: 'UND_ERR_SOCKET',
    socket: {
      localAddress: '169.254.76.1',
      localPort: 54930,
      remoteAddress: undefined,
      remotePort: undefined,
      remoteFamily: undefined,
      timeout: undefined,
      bytesWritten: 1195,
      bytesRead: 1894
    }
  }
}

tlhunter commented 9 months ago

@arturkasperek is this still an issue for you?

eXigentCoder commented 8 months ago

I just hit the same issue with "dd-trace": "^2.12.2": [image]

I'm going to upgrade to the latest version to see if it's fixed; I see I'm far behind.

natethelen commented 1 month ago

We have a similar need because dd-trace fails to parse our large JSON POST bodies: #4594