Dealing with HostReportError on the server

zenparsing commented 6 years ago

I am reluctant to bring this discussion up, but I'm having trouble updating some Node applications to use the current specification. The intention is not to re-litigate but to solicit feedback on how to address these issues.

The Use Case:

We have a Node.js HTTP application that integrates functionality from a bunch of third-party service providers. The functionality for each provider is implemented using a plugin system. Occasionally the plugins need to coordinate with each other, and for this purpose we use a multicast Observable as an event hub. Plugins can generate events and other plugins can listen for and respond to these events through the Observable interface.

We are currently using an older version of the specification, in which errors are propagated from consumer to producer. In our system, if a plugin throws an error from an observer, we catch that error, report it to NewRelic, and return a 500 error response to the client.

If we upgrade to the current version of the specification without any other changes to our codebase, an unexpected error from a plugin's observer will crash the process. This is not acceptable to us.

Possible Solutions:

We could wrap the bodies of all observer "next" functions in try-catch.
We could use an operator for listening and not use subscribe. This new operator would catch and report errors.
We could subclass and override subscribe to catch and report errors.

None of these options are very palatable.

The first option feels like we're introducing boilerplate that we ought to be able to abstract away. The second and third options feel like we're using operators and subclassing as workarounds, rather than extension points as intended.

Of course, we have a similar challenge with Promises. For promises, the rule is: either return the promise or use .catch. Most often we return the promise. The cases where we do not return the promise are pretty easy to spot and the call to .catch doesn't feel like boilerplate.

@benlesh I know that RxJS has implemented the HostReportError semantics. Has this issue come up? If so, what solutions are developers using? Is there something that I'm missing?

Thanks!

benlesh commented 6 years ago

The only people that should hit a problem are people that are trying to perform a try/catch around a subscription, but that's really an (sorry I hate this term) "anti-pattern". Most observables are async, and async doesn't really work with try/catch.

So far all I've seen is it solving crazy bugs people get while multicasting.

what solutions are developers using?

If people do hit this issue, in RxJS it's as simple as providing an error handler to the subscribe method. Then it's no longer unhandled, and you still have the opportunity to deal with it synchronously:

someSyncObservableThatBreaks.subscribe({
  next(value) { handleStuff(value); },
  error(err) {
    syncErrorThrown = true;
    syncError = err;
  }
});

if (syncErrorThrown) { /* handle syncError here */ }

benjamingr commented 6 years ago

@benlesh

Most observables are async, and async doesn't really work with try/catch.

Actually, most people using Node.js are using async/await and try/catch happily today - if you believe the recent surveys (I can dig those up if you'd like).

I am explicitly not saying whether or not I think it's a good or bad idea or bringing it up for discussion - I am just saying that that seems to be the case.

Even in that case, when something throws and that isn't caught (or awaited) - the process typically crashes and a heap dump is taken to figure the problem out (at least, well behaved ones).

It is actually very challenging to gracefully dealing with thrown errors - and I've only seen a handful of cases where this was handled correctly.

@zenparsing

In our system, if a plugin throws an error from an observer, we catch that error, report it to NewRelic, and return a 500 error response to the client.

In general, in Node.js your async code (except when using promises or async/await) should never throw errors. Your observables should error (gracefully) and you can .catch those. See https://github.com/tc39/proposal-observable/issues/177 for code examples.

That said - I see why it would be very challenging to migrate a code base that wasn't built with that expectation.

I think it is unreasonable to expect observables to deal with this since things like setTimeout or callbacks don't.

zenparsing commented 6 years ago

@benjamingr So basically, either wrap my handlers with try-catch or don't use subscribe.

I think it is unreasonable to expect observables to deal with this since things like setTimeout or callbacks don't.

Counter-example: EventEmitter.

Anyway, thanks!

zenparsing commented 6 years ago

FWIW, I've reverted this behavior from zen-observable, since it does not work well for these use cases.

benlesh commented 6 years ago

The use case being you want to wrap subscribe in a try-catch? Because you want synchronous error handling?

Just trying to understand.

I think @benjamingr may have misunderstood what I was saying above.

zenparsing commented 6 years ago

Essentially, we need to create something like EventEmitter, where the emitter can send data to a bunch of listeners and can catch any unhandled exceptions thrown by those listeners.

With error swallowing we can't implement such a thing. At this point, we prefer to simply not match the spec, over the other alternatives.

zenparsing commented 6 years ago

I'm going to reopen this as I think there is still a deep design controversy here that was never resolved.

benlesh commented 6 years ago

It seems like this issue is trying to enable this:

try {
  source.subscribe({
    next(v) { throw new Error('oops'); }
 })
} catch (err) {
  console.log('error handled');
}

If an author is writing a final observer they think is going to fail, they should use forEach and handle errors in the subsequent promise.

source.forEach(v => { throw new Error('oops'); })
  .catch(err => console.log('error handled'));

In RxJS, this is never a problem, because people just use do (aka tap) in that case;

 source.do({
    next(v) { throw new Error('oops'); }
 })
 .subscribe({
   error(err) { console.log('error handled'); }
 });

try/catch is just a mechanism that doesn't lend itself well to reactive programming, outside of async/await functions, where really it's just sugar around Promise chaining, not a "real try/catch" (whatever that means, haha)

benlesh commented 6 years ago

To take this further, this is also an anti-pattern, because source could be sync or (more likely) async:

try {
  source.map(v => { throw new Error('bad'); })
    .subscribe({ next(v) { console.logI(v); } });
} catch (err) {
  console.log('error handled');
}

... in the async case, the try/catch is useless.

The proper way to do this would be to use the error callback in subscribe:

 source.map(v => { throw new Error('bad'); })
    .subscribe({
      next(v) { console.logI(v); },
      error(err) { console.log('error handled'); }, 
    });

zenparsing commented 6 years ago

@benlesh That's not quite the issue that I have in mind.

Consider this scenario:

class Emitter extends Observable {

  constructor() {
    super(sink => {
      this.listeners.add(sink);
      return () => this.listeners.delete(sink);
    });
    this.listeners = new Set();
  }

  emit(event) {
    this.listeners.forEach(sink => {
      try {
        sink.next(event);
      } catch (err) {
        logError('An error occurred in an event listener', err);
      }
    });
  }

}

The idea is that the emitter can catch and deal gracefully with errors thrown from listeners.

With the current semantics, this doesn't work because errors thrown from observers can't be caught. They crash the program.

The solutions, assuming the current behavior, are listed in the OP. But they don't feel right to me: essentially they are all saying that for some use cases you shouldn't use subscribe to... subscribe. That message doesn't seem right for a language-level primitive.

The problem expresses itself concretely in the use case from the OP. More abstractly, the current semantics violates JavaScript's core exception propagation semantics: callers can catch exceptions thrown by callees.

Please note that promises don't violate this rule: the relationship between a promise producer and promise consumer is not caller-callee.

benlesh commented 6 years ago

Observable is designed to set up an relationship between a producer and a consumer that supports reactive programming. The Emitter shown above isn't designed well for reactive programming because the producer is trying to handle errors thrown by the consumer. it should be the other way around. The notification flow should always be unidirectional.

zenparsing commented 6 years ago

isn't designed well

Sigh...

benlesh commented 6 years ago

isn't designed well for reactive programming

You misunderstand. I see what you're trying to do, but maybe Observable isn't the right thing to build it on top of.

benlesh commented 6 years ago

I guess the risk is, if we move back to the behavior we still have in RxJS 5, there are some nasty bugs it can create with multicasting due to producer interference. (Basically, sync thrown unhandled errors unwind the stack and kill mutlicasts).. that is, unless we force producers to do error handling for their consumers as you're doing in your Emitter impl. But what would we do with it in a generic sense? Add some specialized error handler to Subjects in Rx? We technically already have error handling on Subjects via subscription. :thinking: I'm just not sure what we would do to accommodate this. I just know that the current behavior causes some very spooky bugs that are hard for people to figure out, mostly because of the somewhat hidden bidirectional notification of errors if they're unhandled.

zenparsing commented 6 years ago

@benlesh

I see what you're trying to do, but maybe Observable isn't the right thing to build it on top of.

Maybe that's it. I guess what I've been after all along (and what I've internalized as design goals for a push-based async primitive in JS) is a little different than "Observable" as developed by the Rx community.

This round of discussion has been helpful, actually. Thank you!

benlesh commented 6 years ago

@zenparsing it's definitely an observable, though... FWIW: RxJS <5 (read: all current production versions of RxJS) has the same semantic you're after. I'd be 100% okay with it, but it causes some crazy issues with producer interference that are hard to design around.

So maybe that's where I need to be sold. Like, how would a make a Subject in RxJS that avoided producer interference, but didn't schedule anything? HostReportErrors seemed like what we should've been doing all along when I saw it as a solution. But perhaps I just need to see some alternatives.

zenparsing commented 6 years ago

Can you give me a concrete example of producer interference for Subject?

benlesh commented 6 years ago

Sure... Let's assume there is a service written by Team1, and simultaneously consumed by Team2 and Team3 in different components in the same application:

module from team1 to be consumed by other teams' code

const subject = new Subject();

const data = new Observable(observer => {
  let i = 0;
  const id = setInterval(() => observer.next(i++), 1000);
  return () => clearInterval(id);
});

data.subscribe(subject);

export const ticks = subject.asObservable();

team2's module in the clock component

import { ticks } from 'team1/ticks';

const clock = document.querySelector('.clock');

ticks.map(() => new Date().toString())
  .subscribe({
    next: datetime => clock.innerText = datetime
  });

team3's ad rotator component

import { ticks } from 'team1/ticks';

const adspace = document.querySelector('.adspace');

const ads = [
  { url: '/store?id=123', image: 'ad1.png' },
  { url: '/store?id=454', image: 'ad2.png' },
  { url: '/store?id=983', image: 'ad3.png' },
]

ticks  // one second ticks
  .filter((_, i) => i % 10 === 0) // every ten seconds
  .map((_, i) => I) // get a count of ticks
  .map(n => ads[n]) // team 3 *thinks* this is going to work.. :(
  .subscribe({ 
    next: ad => {
      adspace.src = ad.image;  // eventually we get a "Cannot read property 'image of undefined'"
    }
  })

In the above code, the bug in team 3's ad rotator will synchronously throw and unwind the stack back to the Subject, resulting in the subject no longer notifying team 2's clock component. Now team 2 thinks they have a bug, but they can't figure out what it is or where it comes from. It seems to be some crazy heisenbug.

... Imagine this only with much more complicated applications. I've seen it several times, actually. And the error is always hard to track down. Sometimes you're looking at a multicast off of a multicast off of a multicast.

zenparsing commented 6 years ago

@benlesh Thanks! This clarifies the issue.

Would it be enough to modify Subject such that it catches errors from the observer and does whatever it wants to do with them (e.g. re-throwing them in a new timeout)?

Then perhaps Subject could be parameterized on its error handling behavior. On the browser it would probably make the most sense to re-throw in a timeout, but on the server it might make sense to log errors to Winston or something.

zenparsing commented 6 years ago

@benlesh What are your thoughts on the suggestion above?

benlesh commented 6 years ago

I'm not exactly sure what the implications would be if we did this in RxJS. I do know that it would mean the behavior of Observable and Subject would be inconsistent with each other. Observables would always rethrow unhandled errors synchronously, unless they were Subjects or they were downstream from a Subject.

I think a deeper issue here is probably whether or not this sort of type should fundamentally allow two-way communication of errors (but only in synchronous cases?).

zenparsing commented 6 years ago

@benlesh But it would solve the application-level issue that you describe above, correct?

Observables would always rethrow unhandled errors synchronously, unless they were Subjects or they were downstream from a Subject.

Up to now, we've always had the situation where an observer might nor might not throw an error (it was not possible for the producer to know ahead of time). Are there any other concrete problems that this has caused, besides the multicasting consumer-interference issue already mentioned above?

I think a deeper issue here is probably whether or not this sort of type should fundamentally allow two-way communication of errors (but only in synchronous cases?).

I prefer to stick to concrete arguments, but I will say that whatever we do, the type should not fundamentally break JS's callee/caller error propagation semantics in the base (unicast) case.

benlesh commented 6 years ago

Are there any other concrete problems that this has caused, besides the multicasting consumer-interference issue already mentioned above?

There is a wicked bug I've seen with Promises and Observable. If someone writes the most obvious promise conversion code like so (or any Observable that wraps a promise, really):

const fromPromise = (promise) => new Observable(observer => {
  promise.then(value => {
    observer.next(value);
    observer.complete();
  }, err => observer.error(err))
});

And then consumes it like so:

fromPromise(somePromise).map(value => { throw new Error(); })
  .subscribe(x => console.log(x))

The error gets swallowed by the promise error trapping behavior. Depending on the promise impl, that means sometimes it's silently unreported. (Obviously, this is less of a problem with modern Promise impls).

But it illustrates another weird problem, where now the error handling in a third party data producer is beholden to downstream actions such as mapping and filtering. If the error unwinds back to the Observable implementation, you can't really assume any specific error type, and you'd have to know to much about how your observable is being consumed in order to examine the error to make decisions about what to do with it. So if it unwinds back to an observable implementation and you catch it there are only three things you can do with it; 1. Swallow it. 2. Schedule it to be thrown. 3. Let it throw (unless you want to introduce a side-effect, which doesn't seem right for the type)

zenparsing commented 6 years ago

@benlesh I suppose a correct implementation would be:

const fromPromise = (promise) => new Observable(observer => {
  observer.then(
    value => observer.next(value),
    error => observer.error(error)
  ).finally(
    () => observer.complete()
  );
});

I see your point. It's easy to forget to deal with the case where an observer's method might throw. And in any such case, the producer won't really know what to do with it other than report it or log it.

On the other hand, having an error in an event listener always crash the program, without allowing any way for a producer to decide to do something else, seems like a different kind of footgun.

Thanks!

zenparsing commented 6 years ago

I left off a very important option for dealing with the use case in the OP: forEach. Instead of wrapping listener function bodies in try-catch, the consumer can use catch on the promise resulting from forEach:

events.on('something').forEach(event => {
  // Event handler code
}).catch(err => {
  // Log the error to a logging service
});

Additionally, an Observable subclass can override forEach to automatically do the catch thing above:

class LoggingObservable extends Observable {
  forEach(fn) {
    return super.forEach(fn).catch(err => {
      // Log the error
    });
  }
}

I think this is a pretty reasonable solution.

zenparsing commented 6 years ago

After seeing the forEach solution, I'm supportive of the HostReportErrors behavior in general.

There's one detail that I'm a little curious about: according to the current spec, any errors that occur within the producer's cleanup function are also swallowed:

let observer;

let subscription = new Observable(o => {
  observer = o;
  return () => { throw new Error('oops') };
}).subscribe();

// Should we swallow the cleanup error?
observer.complete();

This behavior seems to go beyond the "idealized observer" behavior that we are trying to create by swallowing errors. It's the producer's error here, not the consumer's. It seems swallowing those errors would make it hard to develop and test the producer code.

benlesh commented 6 years ago

@zenparsing One important thing about Observables is that guaranteed teardown. If you have a bunch of operators, you're basically chaining together observable subscriptions, of one of them in the middle throws an error during teardown, then the remaining teardown will not be called, and resources will remain allocated in limbo. Therefor you need to either: 1) differ throwing the error(s) until all teardown has completed. Or 2) Just schedule the throwing of the error (HostReportError) and continue merrily tearing down subscriptions.

zenparsing commented 6 years ago

@benlesh I'm not suggesting that subscription.unsubscribe() be allowed to throw. If we swallow errors from unsubscribe, then cascading teardown will work just fine.

I'm wondering whether subscriptionObserver.complete and subscriptionObserver.error should be allowed to throw if the cleanup function throws. I'm not sure that it makes sense to allow the producer to continue forward progress if the producer throws an error.

zenparsing commented 6 years ago

I think I'm now satisfied with the swallowing behavior.

There are a couple of points that ended up convincing me:

A consumer can ergonomically handle its own errors by using forEach.
The swallowing behavior is enforcing a more strict contract between the producer and consumer. The stricter contract makes it easier to develop producer code.
Since Observable is typically for async anyway, developers already have to deal with HostReportErrors-like situations.

Thanks for all the help!

tc39 / proposal-observable

Dealing with HostReportError on the server #178