Closed zenparsing closed 9 years ago
The inevitability for anyone using Observable is that they're going to need some sort of CompositeSubscription. Doing this with function composition is likely to be slow and not very ergonomic.
There will have to be some mechanism to wrap every unsubscription function with another function that closes over an isUnsubscribed
variable and ensures that unsubscription doesn't happen more than once.
There's state involved, is I guess what I'm saying. And when there's state involved, I think it's better to return an object from an API, because it enables that state to be examined and/or at least neatly packaged with the unsubscribe method.
Playing around with a merge operator and this design. It's not too bad. Basically pushing functions onto an array to be called in the outer unsubscribe returned. So my first comment probably isn't a big deal.
I'm standing with my second point though. That there is state involved and it should be packaged with the returned Subscription object.
Also objects are much, much easier to extend with new methods and properties later, where returning a function is going to design you into a corner.
@blesh With the performance measurement tools you have in RxNext, is it feasible to get some metrics on performance differences between returning a cancellation function versus a subscription object?
An object should be a little cheaper, but I don't think it would matter with the amount of observable a creates
@zenparsing Yes, we could put some tests around that again. Back when we started RxJS Next, I tried a version with just a returned unsubscribe function and it was slower only because I had to either close over something or I had to use function bind. Generally, I've found when there is state involved with a function it's best to have an object with the function as a method.
@blesh Can you link m to some of the code that you use for performance testing?
The conclusions @blesh outlined are also common sense. It's definitely faster to pass data and access a function over it than to pass a function that captures over data.
The question is whether or not the difference is noticeable in real world use cases to even consider. IMO returning an object with a function rather than just a function in the case of unsubscribe also just makes for a nicer/cleaner API. So I prefer it regardless.
@benjamingr I'm curious why you think returning a subscription object is cleaner. Returning a cancellation function feels a whole lot cleaner and more idiomatic to me. Can you come up with an example which demonstrates your point of view?
Well, when you do:
var ret = obs.subscribe();
Doing:
ret.dispose(); // or unsubscribe
Is a lot more explicit then:
ret()
It's a tad longer to write, but IMO it's more explicit, and it allows adding methods to the returned value (like, for example: making it then
able) by the specification and/or by userland implementations at a later time (which, I'm aware is possible with a function, but is quirkier).
On the other hand, if it's just a function it'd be handy for passing it as a callback to an event somewhere. If it's a method on some object then it'd mean whatever you pass the object to would have to call the method itself, and just passing the method by reference probably won't work in the native implementation since these things usually expect some sort of this value from what I've seen (For example Promise.all needs to be called as a method of Promise).
var dispose = obs.subscribe();
socket.onclose = dispose;
Though I agree with having an object be returned for the reasons of extensiblity that you described.
@benjamingr sure, if you name your variables silly things like ret
, it's a lot less clear. Name your variable disposeObs
and things are fine.
it allows adding methods to the returned value (like, for example: making it thenable) by the specification and/or by userland implementations at a later time
What might be added later? It seems like, given the long experience with Rx.NET and RxJS, we should have a good idea already about additional capabilities that need to be returned from subscribe. Maybe someone with more experience there can comment?
Anyway, I don't really care too deeply about the return type of subscribe. What I care about is that:
We allow the subscriber function to return a cleanup function:
new Observable(sink => {
// Initialize it
return _=> { /* Cleanup */ };
});
We avoid any ugly overloading of the return value of the subscriber function.
new Observable(sink => {
function f() {}
f.unsubscribe = function() {};
return f; // Cleanup function or "unsubscribe-able"??
});
If subscribe returns a function, then it all fits together nicely without overloading:
new Observable(sink => {
// Easy chaining:
return someOtherObservable.subscribe({
// Yadda yadda
});
});
What might be added later? It seems like, given the long experience with Rx.NET and RxJS, we should have a good idea already about additional capabilities that need to be returned from subscribe.
In RxJS Next, we added the ability to add and remove subscriptions to an existing subscription. Effectively making all Subscriptions the equivalent of "CompositeDisposables" in current RxJS. Composite Subscriptions are necessary for operations like merge
where you're creating one subscription with N underlying subscriptions (for each Observable you're merging in). This holds true for a laundry list of operations: combineLatest
, withLatestFrom
, zip
, flatMap
for example have N child subscriptions. Then there are those with a single child subscription like concat
or switch
etc.
I guess a basic Subscription contract, as it as evolved to exist in RxJS Next is:
interface Subscription {
unsubscribe(): void
isUnsubscribed: boolean
add(sub: Subscription): Subscription
remove(sub: Subscription): Subscription
length: number
}
The second half of that interface is to support the "composite" behavior I mentioned above.
The first half is obviously more important. I think the only point some might debate is the usefulness of isUnsubscribed
(or a similar boolean). I've found it useful to be able to examine the state of a subscription when writing apps, but that could obviously be done in many ways, such as a side effect on completion or error. The nice thing about isUnsubscribed
, though, is that it accounts for error, completion and unsubscription. In any of those cases isUnsubscribed would be true. (It may not always guarantee the underlying unsubscribe logic has been called though, I'll need to check on that)
@zenparsing ... you asked where the perf tests are for RxJS Next... they're under a directory perf, there are both micro and macro perf tests.
The tests have changed quite a bit since I was experimenting with @jhusain and @trxcllnt months ago on various implementations of Observable and it's supporting types. It's probably worth adding some macro performance tests specifically around unsubscription of large composite subscriptions.
Currently the tests are focus on comparing operations with the previous version of RxJS.
To be clear, future versions of ReactiveX/RxJS will be removing the ability to return an unsubscribe function/Subscription from subscribe
altogether, in favor of injecting the Subscriber with the Subscription by calling subscriber.start(subscription)
before next'ing into it.
@trxcllnt I am not in favor of that approach.
@zenparsing why?
I find it makes things a lot more tedious to work with. One of the things that I like about Promises, for example, is that the interface is super simple. You don't have to have a bunch of different complex objects interacting with each other, and you can just throw one out and have it work well. When you have to create an instance of a Subscriber
just to use an observable, then it gets really tedious and verbose to do anything. Having to use classes out the wazoo also isn't very much in tune with the way JS APIs are going in the native side AFAIK.
@trxcllnt Because I find it that API to be "both odious and error-prone" (to quote myself). Compare these two implementations of takeUntil:
@RangerMauve to be clear, you don't have to create an instance of Subscriber
to work with Observables. The public subscribe
method works with 3 callbacks or an Observer
, and creates the initial Subscriber
internally for you (either wrapping the 3 callbacks or the Observer
). In the current ReactiveX/RxJS project, this initial Subscriber
is used as the shared CompositeSubscriber, and all subsequent inner Subscribers proxy to it.
Similarly, the subscriber.start(subscription)
call would also happen automatically in the first call to the public subscribe
method, so custom Observables wouldn't need to implement this themselves.
@zenparsing this is a false dichotomy, as that's not the only way to implement the feature.
@trxcllnt I would very much like to see another non-buggy implementation of takeUntil using the start-based API, without resorting to intermediate helper functions.
@trxcllnt Would Subscriber
be just an internal detail with the API remaining the same as what @zenparsing described?
@zenparsing @RangerMauve yes, Subscriber
is effectively an implementation detail. Subscriber
extends Subscription
and implements Observer
.
The Observer API changes to this:
interface Observer<T> {
start(subscription: Subscription<T>): void;
next(value: T): void;
error(err?: any): void;
complete(): void;
isUnsubscribed: boolean;
}
Observable's public subscribe method changes to this:
subscribe(observerOrNext?: Observer<T> | ((value: T) => void),
error?: (error: T) => void,
complete?: () => void,
start?: () => void): Subscription<T> {
let subscriber: Subscriber<T>;
if (observerOrNext && typeof observerOrNext === "object") {
if(observerOrNext instanceof Subscriber) {
subscriber = (<Subscriber<T>> observerOrNext);
} else {
subscriber = new Subscriber(<Observer<T>> observerOrNext);
}
} else {
const next = <((x?) => void)> observerOrNext;
subscriber = Subscriber.create(next, error, complete, start);
}
subscriber.start(subscriber);
if (!subscriber.isUnsubscribed) {
this._subscribe(subscriber);
}
return subscriber;
}
If the Observer synchronously unsubscribes, the source Observable's private _subscribe
method is never invoked, and the start
logic never leaks into operator or custom Observable implementations.
Anyone care to take on @zenparsing's challenge? Frankly the contrast isn't flattering. Is this just one example of an API that's harder with injected subscriptions? is the sample size too small. Are there any other APIs that are easier with injected subscriptions?
JH
On Sep 3, 2015, at 2:28 PM, Paul Taylor notifications@github.com wrote:
@zenparsing @RangerMauve yes, Subscriber is effectively an implementation detail. Subscriber extends Subscription and implements Observer.
The Observer API changes to this:
interface Observer
{ start(subscription: Subscription ): void; next(value: T): void; error(err?: any): void; complete(): void; isUnsubscribed: boolean; } Observable's public subscribe method changes to this: subscribe(observerOrNext?: Observer
| ((value: T) => void), error?: (error: T) => void, complete?: () => void, start?: () => void): Subscription { let subscriber: Subscriber<T>; if (observerOrNext && typeof observerOrNext === "object") { if(observerOrNext instanceof Subscriber) { subscriber = (<Subscriber<T>> observerOrNext); } else { subscriber = new Subscriber(<Observer<T>> observerOrNext); } } else { const next = <((x?) => void)> observerOrNext; subscriber = Subscriber.create(next, error, complete, start); } subscriber.start(subscriber); if (!subscriber.isUnsubscribed) { this._subscribe(subscriber); } return subscriber;
} If the Observer synchronously unsubscribes, the source Observable's private _subscribe method is never invoked, and the start logic never leaks into operator or custom Observable implementations.
— Reply to this email directly or view it on GitHub.
@zenparsing @jhusain a little clarification on the RxJS Next front: The observer signature with the start
method that receives the subscriber
is an idea we've kicked around, however nothing will be implemented in RxJS Next without perf specs to back it up. We have simple goals:
And a corner goal to match this spec. Just want this spec to settle down a bit first.
One of the attractions to the start
method outlined above, is in the long term some of the server teams here would like a (separate) implementation of an "RxFlowable" in JS, which is really just the RxJava 2 style of Observables with back-pressure. If both the Flowable and the Observable types match with the exception of Flowable's subscriber having a request
method, it might be beneficial.
In order for anything to land in RxJS Next, it will have to prove better perf and smaller call stacks first and foremost, then it will have to show good ergonomics, etc.
@zenparsing as far as perf work goes, right now apparently an update to Chrome has broken RxJS Next's macro perf tests, so I wouldn't be able to test any other subscription configurations at this time. Also, it's a low priority to me, as I've tested this before and it ended up being slower.
@blesh start
also enables synchronous unsubscription, something that's not possible with the present implementation.
@jhusain the current implementation of ReactiveX/RxJS with the changes I outlined is an example of the start
approach, but it doesn't force Operators or Observables from knowing/caring about creating/injecting a Subscription before emitting. I'll put a PR together to compare the perf difference.
@trxcllnt ah yes. Thank you for the reminder. That's a big deal. Do we have an issue tracking that?
@trxcllnt It appears that the subscribe
implementation you posted still returns the subscription object. Is that right? Also, it looks like the subscriber function can still return a cleanup function. True?
@blesh No worries - the links to your perf scripts have pointed me in the right direction. Thanks!
@zenparsing are you familiar with how we've implemented ReactiveX/RxJS? I worry my tone comes across as patronizing sometimes, but I err on the side of over-explaining when I'm not clear how much you know about what we've done.
The public subscribe
returns the Subscription
(so whoever called subscribe
can unsubscribe if they choose), but my proposed change ignores anything returned by the private _subscribe
.
When you create an Observable via the constructor, the constructor argument subscribe
is set as the Observable's private _subscribe
. If you create an Observable via lift
, the Observable returned by lift
will use the Observable prototype's _subscribe
, which is optimized for lifted Observables.
With my proposed change, there's no extra logic in custom Observables or Observable operators. RxJS's takeUntil is the class-based lift
equivalent to your closure-based example. If you take into account the fact that we don't need to manually dispose of the inner Subscription (because all Subscribers share the same underlying CompositeSubscription, which disposes of all inner subscriptions when the Observable completes/errors/is unsubscribed from), we have one less concern than the closure approach.
Readying a PR to ReactiveX/RxJS to compare the perf difference now.
@trxcllnt Thanks for the explanation. So under your proposal, subscribe
actually does return the unsubscription capability, which is different from what was mentioned earlier, namely that subscribe
would have no return value.
But you say that the subscriber function (what you're calling _subscribe) can no longer return anything useful. If so, then how does the subscriber function register it's cleanup routine?
new Observable(sink => {
// I can't return my cleanup action, so how do I specify it?
});
@zenparsing the subscriber function takes a Subscriber (not an Observer), which implements both Subscription and Observer. The subscriber function adds any Subscribers/unsubscribe functions to the subscriber
argument, e.g. subscriber.add(Subscription | unsubscribeFunction)
.
@zenparsing this works because the subscriber
argument also proxies adding/removing subscriptions to the shared underlying "composite" Subscription.
@trxcllnt Ah, that makes more sense. You register the cleanup action via "Subscription.prototype.add" and "subscribe" returns the subscription.
Still, this setup requires a significant increase in the conceptual complexity of the API. I'm still strongly opposed.
@zenparsing reasonable people can disagree about levels of conceptual complexity. we discovered this approach was much faster than supporting returning Subscription | unsubscribeFunction | void
from _subscribe
.
@trxcllnt What do you mean by "faster"?
I feel like I lost track of where this conversation was going and thought it was done. For some reason I thought you were going to go with objects not functions. Perhaps it was a conversation I had with @jhusain in the hallway.
Regardless of whether or not it's "faster", I strongly feel subscriptions should be represented as objects, not just functions. Objects represent that there is more to the unsubscribe
function than a simple call. The embody the idea that there is state underneath that call. They're also much more extensible than a function. Finally, I think the object should have an isUnsubscribed
property to reveal an important piece of that state.
If you have a subcription class, it could be subclassed or a duck-type could be made to support "composite" subscriptions. "Composite subscriptions" are an absolute necessity to developing many operators such as flatMap
, merge
, zip
, retryWhen
, etc. Basically an add
and remove
method (names could very) can easily be added to a subclass of subscription that maintains an inner collection of child subscriptions
to unsubscribe
from when unsubscribe()
is called on the composite.
Could you do this with functions? Sure, but it will be painful and inconsistently implemented. In some cases, you'll need to have your new Observable's subscriber function return a function that closes over two inner unsubscribe functions. In other cases you'll need to maintain an array, push
-ing and splice
-ing inner unsubscribe functions on and off of it, and returning from your new Observables subscriber function am unsubscribe function that loops over that array calling all of it's unsubscribe functions... and hopefully, all of those unsubscribe functions are implemented in such a way they don't allow themselves to be called twice.
This is handled much more cleanly by a Subscription class, which can be subclassed for use as a CompositeSubscription or even as a "Serial" Subscription for operators like switch
, where maintain only one inner subscription at a time, and dispose of the previous as new ones start.
zip
For example, if someone were to try to make a zip
operator with the current returned functions, it would be more difficult than it would with a subscription object that had a unsubscribe()
method and an isUnsubscribed
property on it. This is because in zip
the logic to complete the zip depends on checking to see if any one subscription is both unsubscribed and it's buffer is empty.
The expected outcome would be this:
(In the "marble" diagrams below, -
represents the passage of time, letters represent events, and |
represents a completion.)
O1: ----a------b-------c---d----|
O2: ------e-------f------------------g----h------i----j-|
R: ------(w)-----(x)----------------(y)--(z|)
where
let R = O1.zip(O2, (a, b) => a + b);
and
w = a + e
x = b + f
y = c + g
z = d + h
Notice that in the marbles above R
emits z
and completes (|
) in the same frame. This is because O1
's subscription isUnsubscribed
and it's buffer is empty. Take note that O1
's subscription is unsubscribed by the time g
is emitted from O2
, but it still has d
in the buffer, so R
is not complete. But as soon as O2
emits h
, R
can emit z
, and O1
's subscription is both isUnsubscribed
and the buffer is empty.
This matter was made a little more urgent by my working with the folks at Bacon and Most.js on getting the interop points integrated into all of our libraries. Also I think Kefir is doing the same. I was under the incorrect assumption that we had settled on Subscription objects being returned. I don't think they're under that assumption, but I strongly disagree with returning a function, and I'd hate for it to become a standard because of momentum.
@zenparsing one concern I have is that there has been some discussion in various quarters about adding an explicit resource disposal interface to JavaScript. Under the circumstances I would prefer if the result of subscribe were an object so that we can add the symbol to it later. Seems somewhat odd to add a symbol to a function retroactively.
On Sat, Sep 26, 2015 at 6:37 PM, Ben Lesh notifications@github.com wrote:
This matter was made a little more urgent by my working with the folks at Bacon and Most.js on getting the interop points integrated into all of our libraries. Also I think Kefir is doing the same. I was under the incorrect assumption that we had settled on Subscription objects being returned. I don't think they're under that assumption, but I strongly disagree with returning a function, and I'd hate for it to become a standard because of momentum.
— Reply to this email directly or view it on GitHub https://github.com/zenparsing/es-observable/issues/48#issuecomment-143511819 .
I prefer the object to the function - but while I'm not sure how likely resource disposal would be to ever happen; functions have Symbol.species, so I don't think there'd be any difference between what kind of object (function or not) was returned and sticking a symbol on it.
Yes, object vs. function is the wrong way to look at this. This is simply "should the returned object have a call behavior or not".
@blesh, not to pick on you, but just to drive home the point:
The embody the idea that there is state underneath that call.
This seems like a very Java-esque point of view. In JS, or any language with closures, functions are a great way to embody state. Indeed, objects are often just dumb property bags, whereas functions are much more likely to carry state with them.
They're also much more extensible than a function.
This is just not true; as I said, it's just a matter of a call behavior or not. Functions are objects.
Finally, I think the object should have an isUnsubscribed property to reveal an important piece of that state.
A function could have that. A bit unconventional, but not a big deal.
All that said, I don't think this is a big deal, and don't have a strong opinion. I just wanted to shift the perspective here, since a lot of the assumptions about functions vs. objects seem wrong here.
This seems like a very Java-esque point of view.
JavaScript is a language with a class
keyword, and implementations with slow closures and optimized classes. A language that is apparently imperative with functional features.
The assumption about objects vs functions is no assumption. It seems the language was designed around it. Promises could be implemented as just functions. They weren't.
@blesh Thanks for reviving this thread. I don't have much time to write today but I'll post my thoughts tomorrow.
Briefly, I think @domenic has a good point here. There are some places in Rx where method signatures are overloaded based on whether an argument is a function or object. But in my opinion, these overloads are questionable in the sense that functions are objects.
One such overload is the return type of the subscriber function. For convenience, we want to be able to return a cleanup function like this:
new Observable(sink => {
return _=> { console.log("This is my cleanup function") };
});
But we also want to allow convenient chaining, like this:
new Observable(sink => {
return otherObservable.subscribe(yaddaYadda);
});
The only way to get both, if subscribe
doesn't return a callable object, is to have the questionable overload semantics that I described above. It's unclean : )
However, if subscribe
returns a function, then everything just kinda fits together naturally.
I'm glad we have a concrete example in the form of zip
. I think we need to see a side-by-side comparison between an implementation that uses subscription objects versus returned functions to properly evaluate. I'll try to work on that tomorrow.
@zenparsing I'm pretty indifferent at this point as both seem relatively comparable functionally and performance wise (although, I'm still slightly in favor or not returning cancellation functions directly since IMO it's not a very familiar JS API type, and it's less explicit).
If you returned an object with bound functions you could just do:
new Observable(sink => {
return otherObservable.subscribe(yaddaYadda).dispose;
});
@benjamingr I doubt that we would get away with speccing it as a bound function. You'd have to do something like:
new Observable(sink => {
let s = otherObservable.subscribe(yaddaYadda);
return _=> s.dispose();
});
I doubt we could get away as speccing the return value as a function, but if you believe we can it might be worth it :)
new Observable(sink => {
return () => otherObservable.subscribe(yaddaYadda).dispose();
})
Isn't too bad, but I agree it's less clean.
In the current version,
subscribe
returns a cancellation function. Previously,subscribe
returned a "subscription" object with an "unsubscribe" method.Rationale:
Regarding 4:
We frequently want to "chain" observables like this:
If "subscribe" returns something other than a
() => void
function, then we have to bake-in some messy overloading in order to make that chaining nice and pleasant. RxJS makes use of overloading in several places (also "subscribe") which I would like to avoid.Thoughts?