WICG / observable

Observable API proposal
https://wicg.github.io/observable/
Other
563 stars 13 forks source link

Is there any concern about the difference in constructor semantics from `Promise`? #22

Closed tbondwilkinson closed 8 months ago

tbondwilkinson commented 1 year ago

To preface, I understand that RxJS and other Observable implementations are widely used and very popular, and that a lot of people have been very successfully using Observables for very long time in many applications.

But there's a difference between using a library and using a web platform primitive. And I do worry a little that people will expect the Observable constructor to behave similarly to the Promise constructor, in particular being executed synchronously and only once for every .then() callback. Observables don't really work that way, as the callback parameter is called when subscribe is called, an unlimited number of times. This is the much discussed cold vs hot distinction.

So for example, maybe it's really easy to write:

const observable = new Observable((subscriber) => {
  subsriber.next(someReallyExpensiveComputation());
});

And people comfortable with promises will think that's only going to be executed once and then each subscriber gets the expensive value, when in reality the expensive computation happens once for every subscriber.

This is less of a concern for Observables that the platform creates for something like event listeners, where the browser is in charge of the observable instantiation and the rest of the API is just really offering a better listener pattern.

benlesh commented 1 year ago

I do worry a little that people will expect the Observable constructor to behave similarly to the Promise constructor, in particular being executed synchronously and only once for every .then() callback.

That cat is sort of already out the the bag for more than a decade now. It's a non-issue in terms of broad understanding of the type. I know that's dismissive, so I'll try to elaborate.

However, to play devil's advocate, there are other APIs that don't quite line up with what the Promise constructor does despite having a similar "look", for example MutationObserver or ResizeObserver:

const resizeObserver = new ResizeObserver((entries) => {
  // do stuff
});

resizeObserver.observe(element);

resizeObserver.disconnect();

const mutationObserver = new MutationObserver((entries) => {
  // do stuff
});

mutationObserver .observe(element);

resizeObserver.disconnect();

And honestly, those two APIs would have been so much better off as observables, because as they're currently implemented, while you can unobserve different elements, if you observe two different elements because you want to do something similar with each, you end up needing to add conditionals in the callback to handle them in whatever slightly different ways you might need to. As opposed to something like new MutationObservable(element).subscribe(callback) or the like.


In terms of unicast/multicast, which is this issue:

const expensiveThingObservable = new Observable((subscriber) => {
  subscriber.next(someReallyExpensiveComputation());
});

This is already something people face with ordinary functions:

const getExpensiveThing = (callback) => {
  callback(someReallyExpensiveComputation())
}

And the solution is exactly the same:

let firstTry = true
let expensiveThing = null
const getExpensiveThing = (callback) => {
  if (firstTry) {
    firstTry = false;
    expensiveThing = someReallyExpensiveComputation();
  }
  return expensiveThing;
}

Again, identical with an observable:

let firstTry = true
let expensiveThing = null
const expensiveThingObservable = new Observable((subscriber) => {
  if (firstTry) {
    firstTry = false;
    expensiveThing = someReallyExpensiveComputation();
  }
  subscriber.next(expensiveThing);
  subscriber.complete();
});

Now when it's asynchronous, it's just a little different:

let waiting = true
let expensiveThing = null
let deferred = new Set()
const expensiveThingObservable = new Observable((subscriber) => {
  if (waiting) {
    someReallyExpensiveAsyncThing((err, value) => {
      waiting = false;
      expensiveThing = value;
      const subscribers = Array.from(deferred);
      for (const s of subscribers) {
        if (err) {
          s.error(err);
        } else {
          s.next(value);
          s.complete();
        }
      }
    })
  } else {
    subscriber.next(expensiveThing);
    subscriber.complete();
  }
});

BUT: No one should be using an observable for an asynchronous single value, that's silly. Just use a promise for that. But even that comes with gotchas. For one thing, it's not an amazing idea to keep a promise around as a means of caching a value. Some promises can close over scope that is retained for the life of the promise, that's a runtime implementation thing, but it's still worth avoiding. So even with promises, the "expensive calculation" issue gets a little weird:

let promise;
let waiting = true;
let expensiveThing;
async function getExpensiveThingAsync() {
  if (waiting) {
    promise ??= new Promise((resolve, reject) => {
      someReallyExpensiveAsyncThing((err, value) => {
        waiting = false;
        if (err) {
          reject(err);
          return;
        }
        expensiveThing = value;
        resolve(value);
      })
    });

    return promise;
  }

  return expensiveValue;
}

In the end, Observable is a very low-level type, Promise is a low-level type. Neither are engineered to solve every problem, rather they are building blocks to allow such things to be built. Observable is actually a little more "low-level" than Promise in a number of ways, and could be used to implement promises (or a promise-like thing) but not the other way around.

tbondwilkinson commented 1 year ago

But are there API decisions that could alleviate this concern?

Like an Observable constructor could look more like ReadableStream:

const subscribers = new Map();
const observable = new Observable({
  subscribe(subscriber) {
    subscriber.next(1);
    ...
    subscribers.add(subscriber, metadata);
  },
  abort(subscriber) {
    ...
    subscribers.delete(metadata);
  },
});

In this example, the Observable constructor doesn't look much like Promises at all, and so it's unlikely to cause confusion. It also somewhat alleviates the concerns of #3, because now the constructor does have a teardown method.

This approach doesn't change behavior at all, it just merely changes the shape of the constructor so that it's less of a Promise-look-alike.

benlesh commented 1 year ago

Like an Observable constructor could look more like ReadableStream

That would really complicate the type.

You can do what you're proposing above pretty easily with the proposed API:

const subscribers = new Map();
const observable = new Observable((subscriber) => {
  subscriber.next(1);
  ...
  subscribers.add(subscriber, metadata);
  return () => {
    subscribers.delete(metadata);
  };
});

the Observable constructor doesn't look much like Promises at all, and so it's unlikely to cause confusion.

I'm not sure how to phrase this without it reading as snarky, so apologies in advance: But new Observable(() => {}) and new Promise(() => {}) already look different because they're different constructors. (same with new MutationObserver(() => {}) and new ResizeObserver(() => {}) et al). Observable and Promise have been in such broad use for the last decade that I don't think there's a big risk there. I mean, one could argue that function* () { return 'hi'; } vs function() { return 'hi'; } is problematic because they return different things when you call them, and the only difference there is just one character... Or async function() {} vs function() {}... I don't think new Promise() vs new Observable() is likely to be a footgun.

tbondwilkinson commented 1 year ago

Fair enough, thanks for your responses and thoughts.

Probably worth leaving open in case someone else has the same thought, but as far as resolution it's your speculation that people won't get confused vs mine that that they will, and your speculation is based on a lot more experience with people using Observable, so I tend to weigh yours a little higher.

But it could be worth finding a more rigorous way to answer this question by reaching out to a broader set of people to hear their opinions as well. Now is the time to figure out if API shape deviations from RxJS Observable is worth doing.

domenic commented 1 year ago

Tom asked me to chime in, so my thoughts:

benlesh commented 1 year ago

I think Ben's arguments analogizing to ordinary functions remain confusing

That's fair, it's not for everyone.

I think the ReadableStream-like-constructor idea is very intriguing

It is, but I'm concerned about introducing anything that isn't battle tested. One thing that the original proposal for Observable for the TC39 lacked was a way to set up teardown during the "start" of the subscription. RxJS solved this by having an add(teardown) method on the Subscriber, and we have a signal on the Subscriber here to handle some of that.

That said, I'm open to an API like this if it works and makes sense, as it could easily be added to existing implementations like RxJS in a non-breaking way.

Here's a use case, just as an example, perhaps someone sets up an Observable that starts two different resources, that could error during start up:

Traditional (TC39) Observable

There's a lot for the user to do to make sure this is safe, which is mostly okay, because it's a pretty basic type. Yow might have to do similar things to reject in a Promise constructor.. The twist is you want to make sure teardown happens. So a try/finally must be added to make sure the teardown function is returned in the event of an error.

To complicate things, the returned teardown function is actually added to the subscription after that subscription would already be unsubscribed (implicitly by the subscriber.error call). So you have to make sure that a closed subscription will automatically execute any new teardowns someone tries to add to it.

new Observable(subscriber => {
  let resource1;
  let resource2;
  try {
    resource1 = new Resource1(value => {
      subscriber.next({ from: 1, value });
    });

    resource2 = new Resource2(value => {
      subscriber.next({ from: 1, value });
    });
  } catch (error) {
    // make sure to communicate the error to the consumer.
    subscriber.error(error);
  } finally {
    // make sure we attempt to teardown 
    return () => {
      resource1?.disconnect();
      resource2?.disconnect();
    }
  }
})

RxJS Observable

RxJS attempts to handle all of this by automatically forwarding any errors thrown in the "start" to the subscriber.error, and it also provides a way to add teardowns to a Subscriber (which inherits from Subscription).

This is a little more straightforward, and could easily be rewritten to use subscriber.signal.addEventListener('abort', () => {}) instead of subscriber.add(() => {}) (but you can see the ergonomics issues there.). It's also worth noting that the teardown has been added to the subscriber/subscription prior to the error being thrown and forwarded, so there's no fancy logic really necessary when it comes to dealing with teardowns after the subscription has already been closed by an error.

new Observable(subscriber => {
  let resource1;
  let resource2;
  subscriber.add(() => {
    resource1?.disconnect();
    resource2?.disconnect();
  });

  resource1 = new Resource1(value => {
    subscriber.next({ from: 1, value });
  });

  resource2 = new Resource2(value => {
    subscriber.next({ from: 1, value });
  });
})

Hypothetical Config-based Observable Ctor

This has a few issues. For one thing, how do we create a new resource per-subscription, to share between the two callbacks? We'd need to have some sort of state setup function or the like?

new Observable({
  start(subscriber) {
    // Todo, create two resources.
  },
  cancel() {
    // Todo, teardown those resources if necessary
  }
})

We could do something to have a callback that initializes some shared state on subscription, but it seems needlessly complicated.

new Observable({
  initState() {
    return { resource1: null, resource2: null }
  },
  start(subscriber, state) {
    state.resource1 = new Resource1(value => {
     subscriber.next(value);
    });

    state.resource2 = new Resource2(value => {
     subscriber.next(value);
    });
  },
  cancel(state) {
    state.resource1?.disconnect();
    state.resource2?.disconnect();
  }
})

Alternatively, we could have something that returns the configuration from within a closure to allow the resources to be created, but I have zero doubt that people will mess this up as well

new Observable(() => {
  let resource1;
  let resource2;
  return {
    start(subscriber) {
      resource1 = new Resource1(value => {
       subscriber.next(value);
      });

      resource2 = new Resource2(value => {
       subscriber.next(value);
      });
    },
    cancel() {
      resource1?.disconnect();
      resource2?.disconnect();
    }
  }
})

Anything else I'm thinking of here seems like it would force eagerness and multicast (closing over outside instances of the resources, manual reference counting, etc), which really dilutes the possible uses of the type and really makes it not an observable.

The crux of it is that the start function is really responsible for creating the things that you'd want to cancel. Therefore it makes sense to hand control over registering cancellation/teardown to the start function, which brings us back to the original (very battle-tested) design.

benlesh commented 1 year ago

(I'm truly sorry that all of this ends up being a novel, I'm just trying to be thorough 😅 )

domenic commented 1 year ago

The way we'd handle that with the streams-like pattern is as follows:

new Observable({
  start(subscriber, state) {
    this.resource1 = new Resource1(value => {
     subscriber.next(value);
    });

    this.resource2 = new Resource2(value => {
     subscriber.next(value);
    });
  },
  cancel(state) {
    this.resource1?.disconnect();
    this.resource2?.disconnect();
  }
});
benlesh commented 1 year ago

I'm amenable to this design. My only concern would be it's new. I'd want to fiddle with it to see if it was able to meet the same needs/use cases. One concern is that it would force it to be multicast, or require ref counting. But I suppose it could be made to not work like that. But maybe that's not a big deal. It requires thought.

domenic commented 11 months ago

I guess the addTeardown() method that was recently added is an alternative to the API shape above, providing the teardown to the constructor? I still think the constructor version is more idiomatic...

domfarolino commented 9 months ago

So we're discussing the possibility of having a bag of options passed into the Observable constructor, instead of just a callback which is responsible for managing teardown from within itself. This means we'd have to have a way to share state between the passed-in start/subscribe callback, and the stop/cancel/teardown callback. But I'm a little confused about what was posed in https://github.com/WICG/observable/issues/22#issuecomment-1586711085. There are three pieces of state there:

domenic commented 9 months ago
  • this: I actually can't tell if this points to the Observable instance (shared between all subscriptions) or the object passed into the constructor that holds the two callbacks (also shared between all subscriptions). I think it's the latter, but either way it seems like it's not a per-subscription based object, right?

The latter was the intent.

But if I understand correctly, this is bad: you need per-subscription state, and there's only a single per-Observable object being passed in my example.

If I understand correctly that you need per-subscription state, then I rescind my idea: it is not a good fit and will be a footgun.

Given that I'm not sure where we are with a solution to this issue. My idea, with this, had the benefit of being (IMO) a natural JS pattern, and one that the web platform already uses with Streams. Without that I think we need to go back to considering less-nice patterns, whether it's the original cleanup-function-returning, or something like encouraging people to tack pieces of state onto the subscriber object, or something else.

domfarolino commented 9 months ago

Without that I think we need to go back to considering less-nice patterns, whether it's the original cleanup-function-returning, or something like encouraging people to tack pieces of state onto the subscriber object, or something else.

I definitely share your opinion about not liking the "return a cleanup function" from the subscriber callback (i.e., the callback that gets passed into the Observable ctor, just so we're all clear). It seems very un-idiomatic. Since the Subscriber object is the obvious per-subscription mechanism, then I think per-subscription cleanups would have to be associated with it. So I subscriber.addTeardown() makes a lot of sense there, and naturally lets each invocation of the subscriber callback register its own cleanup(s) etc.

domfarolino commented 8 months ago

Currently the explainer & Chromium implementation (behind a flag) has been using the addTeardown() design expressed here, to add per-Subscription teardown callbacks. I think this design is satisfactory so I'm going to close this issue, but happy to re-open if we want to continue the discussion.