exceptionless / Exceptionless.JavaScript

Exceptionless JavaScript client
http://exceptionless.com
Apache License 2.0
59 stars 24 forks source link

Unhandled exceptions in Node.js #29

Closed frankebersoll closed 9 years ago

frankebersoll commented 9 years ago

Problem

When an unhandled exception occurs, by default, Node.js writes the exception to the console and exits the process.

As I understand the source code and infer from other Exceptionless clients, the exceptionless.node.js client should extend that behavior by submitting the exception before the process quits. It doesn't do that. In fact, manually submitting errors and log messages works as expected, as long as the process keeps running long enough to submit the queue. But as soon as an unhandled exception occurs, it will be ignored:

var client = require("./dist/exceptionless.node.js").ExceptionlessClient.default;

client.config.apiKey = "YOUR_KEY";
client.config.useDebugLogger();

client.submitLog("Test: Log");

throw new Error("Test: Error");

Running this code will neither submit the log message nor the unhandled exception.

Causes

Inspection of exceptionless.node.ts and DefaultEventQueue.ts show several problems:

  1. When exceptionless.node.js is loaded, it subscribes the UNCAUGHT_EXCEPTION and BEFORE_EXIT events on the global process object. The Node documentation states that those events are named in lower camel case (uncaughtException and beforeExit) instead. They have been named like that since the process class has been first introduced in Node.
  2. For some reason, the uncaught-handler doesn't include a call to queue.process while the beforeExit-handler does. But the beforeExit event will never be emitted in case of an unhandled exception:

    'beforeExit' is not emitted for conditions causing explicit termination, such as process.exit() or uncaught exceptions (...)

  3. If the uncaught-handler was registered correctly, in the event of an unhandled exception it would keep the application running, thereby leaving it in an undefined state. This is bad behavior according to the Node docs:

    If a listener is added for this exception, the default action (which is to print a stack trace and exit) will not occur.

    (...)

    An unhandled exception means your application - and by extension Node.js itself - is in an undefined state. Blindly resuming means anything could happen.

    (...)

    uncaughtException should be used to perform synchronous cleanup before shutting down the process. It is not safe to resume normal operation after uncaughtException. If you do use it, restart your application after every unhandled exception!

  4. queue.process is asynchronous, as it submits messages using http requests. Though, it has no means of signaling an empty queue. This would be needed for implementing a submit-exception-then-exit behavior.

    Discussion

In my opinion, exceptionless.js should assist the user in adding low-ceremony error logging to their applications while making reasonable choices for common use cases, as other Exceptionless clients already successfully do. Unhandled exceptions are a rather common use case, so what seems to be reasonable, unobtrusive behavior in the event of an unhandled exception?

Without reading any docs or sources, I would have expected requiring("exceptionless") and setting athe api key to enable the following:

  1. An unhandled exception occurs.
  2. The error is logged to the console.
  3. The exception is added to the submission queue.
  4. The submission queue gets processed. No other code is allowed to run in the meantime.
  5. Any errors in regard to submitting the queue also get logged to the console.
  6. The process exits with code 1.

This would be in accordance to the Node documentation by not letting any application code run after an unhandled exception occurs.

Step 4 in particular seems tricky to get right. In an asynchronous environment like Node, how can we process our queue while blocking any other code from execution? There is no http.requestSync.

Some libraries try to imitate synchronous behavior using child_process.spawnSync - this would require forwarding the currently queued messages and configuration as arguments, submitting the queue in the subprocess and returning any errors to the parent process.

Before I go ahead and mess with the codebase, I would appreciate any of your comments about this topic.

niemyjski commented 9 years ago
  1. Do you want to submit a pull request for this name change. I'm not sure why it was uppercased but we should support the current version of node uses.
  2. Yes, if the application will crash on uncaught exception and won't call beforeExit event then we should call process queue. But like you said, the app might close, but at least it will be kicked off and may be submitted, that's better than the possibility of nothing ever being submitted.
  3. Currently we are using an in memory storage for the queue, the idea was that we wanted to implement local storage or something like that. If we had that then we could log these exceptions and exit, then when the app restarts we could submit these error messages. But it would be nice to submit and then close. Basically in all of our clients we want to intercept these handlers, log it and then let the app do what ever it does without us interfering with it (crashes, keeps running etc..). We don't want to mess with the work flow.
  4. Yeah, it would be nice to process the queue sync and then exit in this case, but this client shares a lot of code with the js client and I'd really hate to special case this especially spawning child processes without the end user knowing about this. I'd almost say we should write a storage implementation so on app restart it's submitted, unless you have a better suggestion (we are all ears :D) and you sound very knowledgable about this. So thank you very much for helping out, it's greatly appreciated.
frankebersoll commented 9 years ago

Well, I got some time to look a little deeper into this:

Do you want to submit a pull request for this name change. I'm not sure why it was uppercased but we should support the current version of node uses.

I can easily do that, but then the event handler will kick in, doing what it currently does. Specifically, it will keep any faulted program running, which is clearly not what we want, as already discussed. So at first, we should figure out what actually should happen in the event handler.

Yes, if the application will crash on uncaught exception and won't call beforeExit event then we should call process queue. But like you said, the app might close, but at least it will be kicked off and may be submitted, that's better than the possibility of nothing ever being submitted.

We shouldn't use the beforeExit event at all, because it won't be called on explicit termination like process.exit() either. This would be unexpected by the developer: Sometimes it works, sometimes it doesn't.

But it would be nice to submit and then close. (...) Yeah, it would be nice to process the queue sync and then exit in this case, but this client shares a lot of code with the js client and I'd really hate to special case this especially spawning child processes without the end user knowing about this.

I'd try to implement the following:

  1. Extension of the submission component, so we have both asynchronous and synchronous submission methods.
  2. Extension of the queue component, so it will use either asynchronous submission (queue.process) or synchronous submission (queue.processSync).
  3. In the uncaughtException handler:
    1. Enqueue the exception.
    2. Check if this is the only uncaughtException handler. If yes, call process.exit(1).
  4. In the exit handler: Call queue.processSync.

This would achieve the following:

  1. Not modify default node behavior (exit on unhandled exception, don't exit if there is another handler).
  2. Try to submit any remaining queued items on exit (in a manner that is resilient to submission problems).
  3. Leave some room to use the solution on the browser side of things, too.

Why would we need this in the browser? I just figured that we're going to lose any queued events in the browser as soon as the user navigates away or closes the window. Using a synchronous XMLHttpRequest (i just learned about that) in the window.unload handler seems to be bad practice and won't work cross domain anyway. But there is this new navigator.sendBeacon API that could be used if it's available.

What do you think?

frankebersoll commented 9 years ago

Fixed with #30