getsentry / sentry-javascript

Official Sentry SDKs for JavaScript
https://sentry.io
MIT License
7.97k stars 1.57k forks source link

Node: registering `requestHandler` changes error handling behaviour #1922

Closed OliverJAsh closed 2 years ago

OliverJAsh commented 5 years ago

Package + Version

Version:

4.6.3

Description

When I enable Sentry.Handlers.requestHandler() (in Express), asynchronous errors will be:

It took me a very long time to understand why an async error inside a request handler would lead to different behaviour in development vs production.

In development the error would cause the process to die/exit (default behaviour in Node), whereas in production the error would be passed on to my error handling middleware, and an error response would be sent. It turned out this was because we were only enabling the request handler in production.

I would like to question whether "passing the error to error handling middleware (by calling next and passing the error )" is really the job of Sentry.Handlers.requestHandler(). Perhaps it should only report the error? After all, that's the main reason to enable Sentry in an application. This way, conditionally enabling Sentry (e.g. depending on env) would not lead to any other unexpected behaviour changes, such as the one I described above.

Alternatively, if we think it is the right thing to do, we should document it clearly, along with a suggestion for how to achieve consistent behaviour when the request handler is not enabled (e.g. by using domains like how Sentry.Handlers.requestHandler() does it).

kamilogorek commented 5 years ago

I would like to question whether "passing the error to error handling middleware (by calling next and passing the error )" is really the job of Sentry.Handlers.requestHandler(). Perhaps it should only report the error?

requestHandler doesn't send reports, it only creates a domain which is used in order to separate context between the requests.

It's a job of errorHandler to take that context, the error and make an event out of them which then can be sent to Sentry.

I'm not sure if I follow along with how it changes the behavior based on the environment. Can you elaborate more? Preferably with code examples? Thanks!

OliverJAsh commented 5 years ago

I'm not sure if I follow along with how it changes the behavior based on the environment. Can you elaborate more? Preferably with code examples? Thanks!

Does this help?

// ./index.js
const express = require('express')
const Sentry = require('@sentry/node')

const app = express();

// Comment this out to disable
app.use(Sentry.Handlers.requestHandler());

app.get('*', () => {
    setTimeout(() => {
        throw new Error('foo')
    })
})

app.listen(3000);
node ./index.js
curl localhost:3000

When the requestHandler is:

OliverJAsh commented 5 years ago

My workaround for now is to always enable requestHandler, and to conditionally enable Sentry, optionally provide the Sentry DSN + enable option. This at least gives us consistent behaviour.

HazAT commented 5 years ago

I will close this issue for now since the specific case seems to be solved, feel free to post again.

OliverJAsh commented 5 years ago

@HazAT I didn't see any response to my last comment, so I don't think this is solved? If so can we re-open?

OliverJAsh commented 5 years ago

Currently, when the domain receives an error, we forward the error to Express' error handling middleware by calling next with the error as an argument:

https://github.com/getsentry/sentry-javascript/blob/913eafc47c7c9ee3bb481615c5ad4016707ce926/packages/node/src/handlers.ts#L236

I think we should change this to only report the error to Sentry:

local.on('error', error => { Sentry.captureException(error); });

This will prevent the requestHandler from modifying Express' behaviour when an error occurs (which shouldn't be a concern for requestHandler), as I described in https://github.com/getsentry/sentry-javascript/issues/1922#issuecomment-466376706.

HazAT commented 5 years ago

Sorry, I was skimming over some issues, I will reopen this.

gaastonsr commented 4 years ago

Looks like local.on('error', next) is unneeded and actually changes the behavior of Express.

kamilogorek commented 4 years ago

actually changes the behavior of Express

How so? What is and what should be the behavior here?

gaastonsr commented 4 years ago

I just tested this locally so I make sure I can repro the problem.

The issue here is that when an error is thrown in a route handler either synchronously or asynchronously, or a rejection unhandled error is raised or really any error generated inside the domain, it shouldn't be caught and passed to next by Sentry, it should bubble up as it normally would.

The issue is that local.on('error', next) is a catch-all for any error raised inside the domain and it changes the default behavior by Express.

kamilogorek commented 4 years ago

it shouldn't be caught and passed to next by Sentry, it should bubble up as it normally would.

As long as there's a registered error handler in express, it'll always be passed to the next call and captured by the first registered middleware that has 4 arguments.

The most basic example:

const app = express()
app.get("/", (req, res) => { throw new Error("wat"); })
app.use((err, req, res, next) => console.log("error handler called"))
app.listen(3000)

When you run the server and ping it, you'll see that the error was still passed to the handlers that were registered after the route itself.

OliverJAsh commented 4 years ago

@kamilogorek What's the status of this issue? Looking at my comments above it seems like the issue I described still stands, although I didn't receive any acknowledgement so I'm not sure whether my comment was clear?

To reiterate, the problem is that async errors are not usually automatically passed to the express error handler—Sentry changes that behaviour, which is unexpected.

kamilogorek commented 4 years ago

@OliverJAsh I tested async version now and I'm not sure if you have something else on your mind or I cannot simply reproduce it.

var app = express();
app.use(Sentry.Handlers.requestHandler());
app.get("/sync", (req, res) => {
  throw new Error("wat");
});
app.get("/async", async (req, res) => {
  throw new Error("wat");
});
app.use(Sentry.Handlers.errorHandler());
app.use((err, req, res, next) => console.log("error handler called"));
app.listen(3000);

In this configuration console.log will be called for /sync endpoint but not for /async. Error will be reported to Sentry for both endpoints.

var app = express();
app.get("/sync", (req, res) => {
  throw new Error("wat");
});
app.get("/async", async (req, res) => {
  throw new Error("wat");
});
app.use((err, req, res, next) => console.log("error handler called"));
app.listen(3000);

In this scenario it'll behave the same in terms of logging, but only /async will be reported (it bubbles to global handler) and /sync will not, as it's captured by express middleware.

OliverJAsh commented 4 years ago

@kamilogorek Your example uses promise rejections rather than async exceptions. Replace:

app.get("/async", async (req, res) => {
  throw new Error("wat");
});

with

app.get("/async", (req, res) => {
  setTimeout(() => {
    throw new Error("wat");
  });
});

I provided a full test case here: https://github.com/getsentry/sentry-javascript/issues/1922#issuecomment-466376706

gaastonsr commented 4 years ago

@kamilogorek let me explain a little bit more. In the case or synchronous error, the behavior doesn't change but there is also no reason why requestHandler should intercept the error and pass it to next, in my opinion, Express should be the one to do that.

In the case of asynchronous errors, the behavior is changed, try with the example that @OliverJAsh provided:

app.get("/async", (req, res) => {
  setTimeout(() => {
    throw new Error("wat");
  });
});

The exception instead of bubbling up to the uncaughtException error handler, it is caught by requestHandler leaving the process alive when most likely it should have been exited.

kamilogorek commented 4 years ago

Verified it and it's indeed the case that timers are captured under the domain umbrella. And I also agree that we should rethink it. However, I'm off for a long vacation in few hours and don't want to make such a behaviour change ad-hoc, as there's always something that can go wrong, even with the simplest changes. I'll make sure to take a look at it once I'm back. Thanks for keeping pushing this issue, as I didn't understood it at first.

qubyte commented 3 years ago

I've just hit this issue and it was a real pain to work out that an error I was throwing in a setImmediate was being captured by a domain. Has there been any movement on this?

LewisJEllis commented 2 years ago

I just ran into this today. This:

local.on('error', next); 

bit should not send the error to next.

You can see here in the old raven-node library that I sent domain errors to the global uncaught exception handler: https://github.com/getsentry/raven-node/blame/master/lib/client.js#L442

but in the raven-node to sentry-javascript port, it was changed to send it to express's next callback instead: https://github.com/getsentry/sentry-javascript/blame/master/packages/node/src/handlers.ts#L156

It is certainly a mistake to send this to the express next callback - that is the root of the behavior difference observed here.

Ideally sentry's node SDK should move away from domains entirely (various other issues are open for that discussion #4633 #3660 etc), but for a quicker fix I would send it to a generic global uncaught exception handler like in onuncaughtexception.ts (like what raven-node was doing before), or maybe just capture it like @OliverJAsh suggests:

local.on('error', error => { Sentry.captureException(error); });

Or just remove the local.on('error', <whatever>) event handler entirely, and errors should then naturally find their own way to the global uncaughtException handler (assuming one is in place).

lobsterkatie commented 2 years ago

Hi, @LewisJEllis.

Thanks for the info (and a decent laugh at the tweet). We're discussing which option we want to go with, and will need to do a little testing before we settle on an approach. But it is back on our radar! I'll update here once we've made a decision.

mohd-akram commented 2 years ago

I stumbled onto this as well and opened #5627. It's not fun when my error logging library is actually breaking my app.

trevorsmith commented 2 years ago

I don't have very deep knowledge of Domain, but now that you're not attaching an error listener is the domain necessary?

We too were (unintentionally) relying on the previous behavior. Our server process started crashing whenever a GraphQLError was thrown, despite it being handled by Apollo. After a couple hours of debugging our entire stack, I discovered the GraphQLError itself wasn't the issue, an unhandled promise rejection from one of our middleware functions was.

Debugging took a lot longer than it should have because when an unhandled rejection hits the domain in this handler, the domain immediately calls the global uncaughtException handler, not the unhandledRejection handler. As a result we didn't have any idea this was related to promises, nor did we get access to promise object from unhandledRejection.

lobsterkatie commented 2 years ago

@trevorsmith - We use domains in order to isolate scope between requests. We will, at some point, move to using async hooks for this purpose, but in the meantime, the domains have to stick around.