tc39 / proposal-observable

Observables for ECMAScript
https://tc39.github.io/proposal-observable/
3.07k stars 90 forks source link

Discussion: Subscribe returns a cancellation function #48

Closed zenparsing closed 9 years ago

zenparsing commented 9 years ago

In the current version, subscribe returns a cancellation function. Previously, subscribe returned a "subscription" object with an "unsubscribe" method.

Rationale:

  1. Since working on this project, I've typed in "subscription.unsubscribe" many, many times. It's ugly and a real pain.
  2. We currently haven't identified any essential capabilities that need to be returned from "subscribe" other than the cancellation function. It's nice to avoid useless wrapper objects.
  3. Any additional capabilities returned from "subscribe" are probably going to be POLA-problematic.
  4. The return types of "subscribe" and the subscriber function align nicely.

Regarding 4:

We frequently want to "chain" observables like this:

function x(observable) {
    return new Observable(observer => observable.subscribe({
        // ... whatever
    });
}

If "subscribe" returns something other than a () => void function, then we have to bake-in some messy overloading in order to make that chaining nice and pleasant. RxJS makes use of overloading in several places (also "subscribe") which I would like to avoid.

Thoughts?

benlesh commented 9 years ago

The inevitability for anyone using Observable is that they're going to need some sort of CompositeSubscription. Doing this with function composition is likely to be slow and not very ergonomic.

benlesh commented 9 years ago

There will have to be some mechanism to wrap every unsubscription function with another function that closes over an isUnsubscribed variable and ensures that unsubscription doesn't happen more than once.

There's state involved, is I guess what I'm saying. And when there's state involved, I think it's better to return an object from an API, because it enables that state to be examined and/or at least neatly packaged with the unsubscribe method.

benlesh commented 9 years ago

Playing around with a merge operator and this design. It's not too bad. Basically pushing functions onto an array to be called in the outer unsubscribe returned. So my first comment probably isn't a big deal.

I'm standing with my second point though. That there is state involved and it should be packaged with the returned Subscription object.

Also objects are much, much easier to extend with new methods and properties later, where returning a function is going to design you into a corner.

zenparsing commented 9 years ago

@blesh With the performance measurement tools you have in RxNext, is it feasible to get some metrics on performance differences between returning a cancellation function versus a subscription object?

benjamingr commented 9 years ago

An object should be a little cheaper, but I don't think it would matter with the amount of observable a creates

benlesh commented 9 years ago

@zenparsing Yes, we could put some tests around that again. Back when we started RxJS Next, I tried a version with just a returned unsubscribe function and it was slower only because I had to either close over something or I had to use function bind. Generally, I've found when there is state involved with a function it's best to have an object with the function as a method.

zenparsing commented 9 years ago

@blesh Can you link m to some of the code that you use for performance testing?

benjamingr commented 9 years ago

The conclusions @blesh outlined are also common sense. It's definitely faster to pass data and access a function over it than to pass a function that captures over data.

The question is whether or not the difference is noticeable in real world use cases to even consider. IMO returning an object with a function rather than just a function in the case of unsubscribe also just makes for a nicer/cleaner API. So I prefer it regardless.

zenparsing commented 9 years ago

@benjamingr I'm curious why you think returning a subscription object is cleaner. Returning a cancellation function feels a whole lot cleaner and more idiomatic to me. Can you come up with an example which demonstrates your point of view?

benjamingr commented 9 years ago

Well, when you do:

var ret = obs.subscribe();

Doing:

ret.dispose(); // or unsubscribe

Is a lot more explicit then:

ret()

It's a tad longer to write, but IMO it's more explicit, and it allows adding methods to the returned value (like, for example: making it thenable) by the specification and/or by userland implementations at a later time (which, I'm aware is possible with a function, but is quirkier).

RangerMauve commented 9 years ago

On the other hand, if it's just a function it'd be handy for passing it as a callback to an event somewhere. If it's a method on some object then it'd mean whatever you pass the object to would have to call the method itself, and just passing the method by reference probably won't work in the native implementation since these things usually expect some sort of this value from what I've seen (For example Promise.all needs to be called as a method of Promise).

var dispose = obs.subscribe();

socket.onclose = dispose;
RangerMauve commented 9 years ago

Though I agree with having an object be returned for the reasons of extensiblity that you described.

domenic commented 9 years ago

@benjamingr sure, if you name your variables silly things like ret, it's a lot less clear. Name your variable disposeObs and things are fine.

zenparsing commented 9 years ago

it allows adding methods to the returned value (like, for example: making it thenable) by the specification and/or by userland implementations at a later time

What might be added later? It seems like, given the long experience with Rx.NET and RxJS, we should have a good idea already about additional capabilities that need to be returned from subscribe. Maybe someone with more experience there can comment?

Anyway, I don't really care too deeply about the return type of subscribe. What I care about is that:

  1. We allow the subscriber function to return a cleanup function:

    new Observable(sink => {
       // Initialize it
       return _=> { /* Cleanup */ };
    });
  2. We avoid any ugly overloading of the return value of the subscriber function.

    new Observable(sink => {
       function f() {}
       f.unsubscribe = function() {};
       return f; // Cleanup function or "unsubscribe-able"??
    });

If subscribe returns a function, then it all fits together nicely without overloading:

new Observable(sink => {
    // Easy chaining:
    return someOtherObservable.subscribe({
        // Yadda yadda
    });
});
benlesh commented 9 years ago

What might be added later? It seems like, given the long experience with Rx.NET and RxJS, we should have a good idea already about additional capabilities that need to be returned from subscribe.

In RxJS Next, we added the ability to add and remove subscriptions to an existing subscription. Effectively making all Subscriptions the equivalent of "CompositeDisposables" in current RxJS. Composite Subscriptions are necessary for operations like merge where you're creating one subscription with N underlying subscriptions (for each Observable you're merging in). This holds true for a laundry list of operations: combineLatest, withLatestFrom, zip, flatMap for example have N child subscriptions. Then there are those with a single child subscription like concat or switch etc.

benlesh commented 9 years ago

I guess a basic Subscription contract, as it as evolved to exist in RxJS Next is:

interface Subscription {
  unsubscribe(): void
  isUnsubscribed: boolean

  add(sub: Subscription): Subscription
  remove(sub: Subscription): Subscription
  length: number
}

The second half of that interface is to support the "composite" behavior I mentioned above.

The first half is obviously more important. I think the only point some might debate is the usefulness of isUnsubscribed (or a similar boolean). I've found it useful to be able to examine the state of a subscription when writing apps, but that could obviously be done in many ways, such as a side effect on completion or error. The nice thing about isUnsubscribed, though, is that it accounts for error, completion and unsubscription. In any of those cases isUnsubscribed would be true. (It may not always guarantee the underlying unsubscribe logic has been called though, I'll need to check on that)

benlesh commented 9 years ago

@zenparsing ... you asked where the perf tests are for RxJS Next... they're under a directory perf, there are both micro and macro perf tests.

The tests have changed quite a bit since I was experimenting with @jhusain and @trxcllnt months ago on various implementations of Observable and it's supporting types. It's probably worth adding some macro performance tests specifically around unsubscription of large composite subscriptions.

Currently the tests are focus on comparing operations with the previous version of RxJS.

trxcllnt commented 9 years ago

To be clear, future versions of ReactiveX/RxJS will be removing the ability to return an unsubscribe function/Subscription from subscribe altogether, in favor of injecting the Subscriber with the Subscription by calling subscriber.start(subscription) before next'ing into it.

zenparsing commented 9 years ago

@trxcllnt I am not in favor of that approach.

trxcllnt commented 9 years ago

@zenparsing why?

RangerMauve commented 9 years ago

I find it makes things a lot more tedious to work with. One of the things that I like about Promises, for example, is that the interface is super simple. You don't have to have a bunch of different complex objects interacting with each other, and you can just throw one out and have it work well. When you have to create an instance of a Subscriber just to use an observable, then it gets really tedious and verbose to do anything. Having to use classes out the wazoo also isn't very much in tune with the way JS APIs are going in the native side AFAIK.

zenparsing commented 9 years ago

@trxcllnt Because I find it that API to be "both odious and error-prone" (to quote myself). Compare these two implementations of takeUntil:

https://gist.github.com/zenparsing/6c9c51f3d8d4187ae84a

trxcllnt commented 9 years ago

@RangerMauve to be clear, you don't have to create an instance of Subscriber to work with Observables. The public subscribe method works with 3 callbacks or an Observer, and creates the initial Subscriber internally for you (either wrapping the 3 callbacks or the Observer). In the current ReactiveX/RxJS project, this initial Subscriber is used as the shared CompositeSubscriber, and all subsequent inner Subscribers proxy to it.

Similarly, the subscriber.start(subscription) call would also happen automatically in the first call to the public subscribe method, so custom Observables wouldn't need to implement this themselves.

trxcllnt commented 9 years ago

@zenparsing this is a false dichotomy, as that's not the only way to implement the feature.

zenparsing commented 9 years ago

@trxcllnt I would very much like to see another non-buggy implementation of takeUntil using the start-based API, without resorting to intermediate helper functions.

RangerMauve commented 9 years ago

@trxcllnt Would Subscriber be just an internal detail with the API remaining the same as what @zenparsing described?

trxcllnt commented 9 years ago

@zenparsing @RangerMauve yes, Subscriber is effectively an implementation detail. Subscriber extends Subscription and implements Observer.

The Observer API changes to this:

interface Observer<T> {
  start(subscription: Subscription<T>): void;
  next(value: T): void;
  error(err?: any): void;
  complete(): void;
  isUnsubscribed: boolean;
}

Observable's public subscribe method changes to this:

  subscribe(observerOrNext?: Observer<T> | ((value: T) => void),
            error?: (error: T) => void,
            complete?: () => void,
            start?: () => void): Subscription<T> {

    let subscriber: Subscriber<T>;

    if (observerOrNext && typeof observerOrNext === "object") {
      if(observerOrNext instanceof Subscriber) {
        subscriber = (<Subscriber<T>> observerOrNext);
      } else {
        subscriber = new Subscriber(<Observer<T>> observerOrNext);
      }
    } else {
      const next = <((x?) => void)> observerOrNext;
      subscriber = Subscriber.create(next, error, complete, start);
    }

    subscriber.start(subscriber);

    if (!subscriber.isUnsubscribed) {
      this._subscribe(subscriber);
    }

    return subscriber;
  }

If the Observer synchronously unsubscribes, the source Observable's private _subscribe method is never invoked, and the start logic never leaks into operator or custom Observable implementations.

jhusain commented 9 years ago

Anyone care to take on @zenparsing's challenge? Frankly the contrast isn't flattering. Is this just one example of an API that's harder with injected subscriptions? is the sample size too small. Are there any other APIs that are easier with injected subscriptions?

JH

On Sep 3, 2015, at 2:28 PM, Paul Taylor notifications@github.com wrote:

@zenparsing @RangerMauve yes, Subscriber is effectively an implementation detail. Subscriber extends Subscription and implements Observer.

The Observer API changes to this:

interface Observer { start(subscription: Subscription): void; next(value: T): void; error(err?: any): void; complete(): void; isUnsubscribed: boolean; } Observable's public subscribe method changes to this:

subscribe(observerOrNext?: Observer | ((value: T) => void), error?: (error: T) => void, complete?: () => void, start?: () => void): Subscription {

let subscriber: Subscriber<T>;

if (observerOrNext && typeof observerOrNext === "object") {
  if(observerOrNext instanceof Subscriber) {
    subscriber = (<Subscriber<T>> observerOrNext);
  } else {
    subscriber = new Subscriber(<Observer<T>> observerOrNext);
  }
} else {
  const next = <((x?) => void)> observerOrNext;
  subscriber = Subscriber.create(next, error, complete, start);
}

subscriber.start(subscriber);

if (!subscriber.isUnsubscribed) {
  this._subscribe(subscriber);
}

return subscriber;

} If the Observer synchronously unsubscribes, the source Observable's private _subscribe method is never invoked, and the start logic never leaks into operator or custom Observable implementations.

— Reply to this email directly or view it on GitHub.

benlesh commented 9 years ago

@zenparsing @jhusain a little clarification on the RxJS Next front: The observer signature with the start method that receives the subscriber is an idea we've kicked around, however nothing will be implemented in RxJS Next without perf specs to back it up. We have simple goals:

And a corner goal to match this spec. Just want this spec to settle down a bit first.

One of the attractions to the start method outlined above, is in the long term some of the server teams here would like a (separate) implementation of an "RxFlowable" in JS, which is really just the RxJava 2 style of Observables with back-pressure. If both the Flowable and the Observable types match with the exception of Flowable's subscriber having a request method, it might be beneficial.

In order for anything to land in RxJS Next, it will have to prove better perf and smaller call stacks first and foremost, then it will have to show good ergonomics, etc.

benlesh commented 9 years ago

@zenparsing as far as perf work goes, right now apparently an update to Chrome has broken RxJS Next's macro perf tests, so I wouldn't be able to test any other subscription configurations at this time. Also, it's a low priority to me, as I've tested this before and it ended up being slower.

trxcllnt commented 9 years ago

@blesh start also enables synchronous unsubscription, something that's not possible with the present implementation.

@jhusain the current implementation of ReactiveX/RxJS with the changes I outlined is an example of the start approach, but it doesn't force Operators or Observables from knowing/caring about creating/injecting a Subscription before emitting. I'll put a PR together to compare the perf difference.

benlesh commented 9 years ago

@trxcllnt ah yes. Thank you for the reminder. That's a big deal. Do we have an issue tracking that?

zenparsing commented 9 years ago

@trxcllnt It appears that the subscribe implementation you posted still returns the subscription object. Is that right? Also, it looks like the subscriber function can still return a cleanup function. True?

@blesh No worries - the links to your perf scripts have pointed me in the right direction. Thanks!

trxcllnt commented 9 years ago

@zenparsing are you familiar with how we've implemented ReactiveX/RxJS? I worry my tone comes across as patronizing sometimes, but I err on the side of over-explaining when I'm not clear how much you know about what we've done.

The public subscribe returns the Subscription (so whoever called subscribe can unsubscribe if they choose), but my proposed change ignores anything returned by the private _subscribe.

When you create an Observable via the constructor, the constructor argument subscribe is set as the Observable's private _subscribe. If you create an Observable via lift, the Observable returned by lift will use the Observable prototype's _subscribe, which is optimized for lifted Observables.

With my proposed change, there's no extra logic in custom Observables or Observable operators. RxJS's takeUntil is the class-based lift equivalent to your closure-based example. If you take into account the fact that we don't need to manually dispose of the inner Subscription (because all Subscribers share the same underlying CompositeSubscription, which disposes of all inner subscriptions when the Observable completes/errors/is unsubscribed from), we have one less concern than the closure approach.

Readying a PR to ReactiveX/RxJS to compare the perf difference now.

zenparsing commented 9 years ago

@trxcllnt Thanks for the explanation. So under your proposal, subscribe actually does return the unsubscription capability, which is different from what was mentioned earlier, namely that subscribe would have no return value.

But you say that the subscriber function (what you're calling _subscribe) can no longer return anything useful. If so, then how does the subscriber function register it's cleanup routine?

new Observable(sink => {
    // I can't return my cleanup action, so how do I specify it?
});
trxcllnt commented 9 years ago

@zenparsing the subscriber function takes a Subscriber (not an Observer), which implements both Subscription and Observer. The subscriber function adds any Subscribers/unsubscribe functions to the subscriber argument, e.g. subscriber.add(Subscription | unsubscribeFunction).

trxcllnt commented 9 years ago

@zenparsing this works because the subscriber argument also proxies adding/removing subscriptions to the shared underlying "composite" Subscription.

zenparsing commented 9 years ago

@trxcllnt Ah, that makes more sense. You register the cleanup action via "Subscription.prototype.add" and "subscribe" returns the subscription.

Still, this setup requires a significant increase in the conceptual complexity of the API. I'm still strongly opposed.

trxcllnt commented 9 years ago

@zenparsing reasonable people can disagree about levels of conceptual complexity. we discovered this approach was much faster than supporting returning Subscription | unsubscribeFunction | void from _subscribe.

zenparsing commented 9 years ago

@trxcllnt What do you mean by "faster"?

benlesh commented 9 years ago

I feel like I lost track of where this conversation was going and thought it was done. For some reason I thought you were going to go with objects not functions. Perhaps it was a conversation I had with @jhusain in the hallway.

Regardless of whether or not it's "faster", I strongly feel subscriptions should be represented as objects, not just functions. Objects represent that there is more to the unsubscribe function than a simple call. The embody the idea that there is state underneath that call. They're also much more extensible than a function. Finally, I think the object should have an isUnsubscribed property to reveal an important piece of that state.

Subscription objects enable simple composite subscription types

If you have a subcription class, it could be subclassed or a duck-type could be made to support "composite" subscriptions. "Composite subscriptions" are an absolute necessity to developing many operators such as flatMap, merge, zip, retryWhen, etc. Basically an add and remove method (names could very) can easily be added to a subclass of subscription that maintains an inner collection of child subscriptions to unsubscribe from when unsubscribe() is called on the composite.

Could you do this with functions? Sure, but it will be painful and inconsistently implemented. In some cases, you'll need to have your new Observable's subscriber function return a function that closes over two inner unsubscribe functions. In other cases you'll need to maintain an array, push-ing and splice-ing inner unsubscribe functions on and off of it, and returning from your new Observables subscriber function am unsubscribe function that loops over that array calling all of it's unsubscribe functions... and hopefully, all of those unsubscribe functions are implemented in such a way they don't allow themselves to be called twice.

This is handled much more cleanly by a Subscription class, which can be subclassed for use as a CompositeSubscription or even as a "Serial" Subscription for operators like switch, where maintain only one inner subscription at a time, and dispose of the previous as new ones start.

Subscription objects are helpful for making zip

For example, if someone were to try to make a zip operator with the current returned functions, it would be more difficult than it would with a subscription object that had a unsubscribe() method and an isUnsubscribed property on it. This is because in zip the logic to complete the zip depends on checking to see if any one subscription is both unsubscribed and it's buffer is empty.

The expected outcome would be this:

(In the "marble" diagrams below, - represents the passage of time, letters represent events, and | represents a completion.)

O1:  ----a------b-------c---d----|
O2:  ------e-------f------------------g----h------i----j-|
R:   ------(w)-----(x)----------------(y)--(z|)

where

let R = O1.zip(O2, (a, b) => a + b);

and

w = a + e
x = b + f
y = c + g
z = d + h

Notice that in the marbles above R emits z and completes (|) in the same frame. This is because O1's subscription isUnsubscribed and it's buffer is empty. Take note that O1's subscription is unsubscribed by the time g is emitted from O2, but it still has d in the buffer, so R is not complete. But as soon as O2 emits h, R can emit z, and O1's subscription is both isUnsubscribed and the buffer is empty.

benlesh commented 9 years ago

This matter was made a little more urgent by my working with the folks at Bacon and Most.js on getting the interop points integrated into all of our libraries. Also I think Kefir is doing the same. I was under the incorrect assumption that we had settled on Subscription objects being returned. I don't think they're under that assumption, but I strongly disagree with returning a function, and I'd hate for it to become a standard because of momentum.

jhusain commented 9 years ago

@zenparsing one concern I have is that there has been some discussion in various quarters about adding an explicit resource disposal interface to JavaScript. Under the circumstances I would prefer if the result of subscribe were an object so that we can add the symbol to it later. Seems somewhat odd to add a symbol to a function retroactively.

On Sat, Sep 26, 2015 at 6:37 PM, Ben Lesh notifications@github.com wrote:

This matter was made a little more urgent by my working with the folks at Bacon and Most.js on getting the interop points integrated into all of our libraries. Also I think Kefir is doing the same. I was under the incorrect assumption that we had settled on Subscription objects being returned. I don't think they're under that assumption, but I strongly disagree with returning a function, and I'd hate for it to become a standard because of momentum.

— Reply to this email directly or view it on GitHub https://github.com/zenparsing/es-observable/issues/48#issuecomment-143511819 .

ljharb commented 9 years ago

I prefer the object to the function - but while I'm not sure how likely resource disposal would be to ever happen; functions have Symbol.species, so I don't think there'd be any difference between what kind of object (function or not) was returned and sticking a symbol on it.

domenic commented 9 years ago

Yes, object vs. function is the wrong way to look at this. This is simply "should the returned object have a call behavior or not".

@blesh, not to pick on you, but just to drive home the point:

The embody the idea that there is state underneath that call.

This seems like a very Java-esque point of view. In JS, or any language with closures, functions are a great way to embody state. Indeed, objects are often just dumb property bags, whereas functions are much more likely to carry state with them.

They're also much more extensible than a function.

This is just not true; as I said, it's just a matter of a call behavior or not. Functions are objects.

Finally, I think the object should have an isUnsubscribed property to reveal an important piece of that state.

A function could have that. A bit unconventional, but not a big deal.


All that said, I don't think this is a big deal, and don't have a strong opinion. I just wanted to shift the perspective here, since a lot of the assumptions about functions vs. objects seem wrong here.

benlesh commented 9 years ago

This seems like a very Java-esque point of view.

JavaScript is a language with a class keyword, and implementations with slow closures and optimized classes. A language that is apparently imperative with functional features.

The assumption about objects vs functions is no assumption. It seems the language was designed around it. Promises could be implemented as just functions. They weren't.

zenparsing commented 9 years ago

@blesh Thanks for reviving this thread. I don't have much time to write today but I'll post my thoughts tomorrow.

Briefly, I think @domenic has a good point here. There are some places in Rx where method signatures are overloaded based on whether an argument is a function or object. But in my opinion, these overloads are questionable in the sense that functions are objects.

One such overload is the return type of the subscriber function. For convenience, we want to be able to return a cleanup function like this:

new Observable(sink => {
    return _=> { console.log("This is my cleanup function") };
});

But we also want to allow convenient chaining, like this:

new Observable(sink => {
    return otherObservable.subscribe(yaddaYadda);
});

The only way to get both, if subscribe doesn't return a callable object, is to have the questionable overload semantics that I described above. It's unclean : )

However, if subscribe returns a function, then everything just kinda fits together naturally.

I'm glad we have a concrete example in the form of zip. I think we need to see a side-by-side comparison between an implementation that uses subscription objects versus returned functions to properly evaluate. I'll try to work on that tomorrow.

benjamingr commented 9 years ago

@zenparsing I'm pretty indifferent at this point as both seem relatively comparable functionally and performance wise (although, I'm still slightly in favor or not returning cancellation functions directly since IMO it's not a very familiar JS API type, and it's less explicit).

If you returned an object with bound functions you could just do:

new Observable(sink => {
    return otherObservable.subscribe(yaddaYadda).dispose;
});
zenparsing commented 9 years ago

@benjamingr I doubt that we would get away with speccing it as a bound function. You'd have to do something like:

new Observable(sink => {
    let s = otherObservable.subscribe(yaddaYadda);
    return _=> s.dispose();
});
benjamingr commented 9 years ago

I doubt we could get away as speccing the return value as a function, but if you believe we can it might be worth it :)

new Observable(sink => {
     return () => otherObservable.subscribe(yaddaYadda).dispose(); 
})

Isn't too bad, but I agree it's less clean.