Closed fghibellini closed 6 years ago
Also from the issue that introduced this behaviour (#60):
People need to learn functional programming and immutable state.
This breaks the well established behaviour that programmers learned to expect from JavaScript
Forcing observables to be async breaks a well established behaviour of observables. The fact that they can be either sync or async is a feature. It's trivial enough to wrap your sync observable in something that returns a promise downstream.
Callbacks and Promises always resolve asynchronously
Promises yes, but callbacks no.
let fn = cb => (console.log(1), cb())
fn(() => console.log(2))
console.log(3)
output:
1
2
3
Forcing observables to be async breaks a well established behaviour of observables. The fact that they can be either sync or async is a feature.
If it is so, then it is probably worth asking whether they are a good fit for JS. Adding a source of bugs in the language under the label "Feature" is probably not in the interest of anyone.
As for the callbacks, it is a consequence of the unfortunate usage of the term callback. It is used for 2 different purposes:
Promises also try to be compatible with the "async callbacks" - the first case is not really of interest in the current context.
Just consider why setTimeout(() => console.log("a")) is always processed asynchronously (it wasn't out of laziness).
The problem is that if you have a generic function that returns some observable and you cannot know whether it is synchronous or asynchronous then you basically have to assume an arbitrary execution order. And unpredictable order of execution and mutable data don't really get along well.
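A minimal sketch of the ordering problem described above. `makeSource` and `trace` are hypothetical helpers written for this illustration, and the one-method `subscribe` shape is a simplification, not the proposal's API.

```javascript
// Hypothetical stand-in for "a generic function that returns some observable".
// The caller cannot tell from the outside whether delivery is sync or async.
function makeSource(deliverSynchronously) {
  return {
    subscribe(next) {
      if (deliverSynchronously) {
        next("value"); // runs before subscribe() returns
      } else {
        queueMicrotask(() => next("value")); // runs after the current job
      }
    },
  };
}

// Records the order in which the caller's code and the callback run.
function trace(source) {
  const order = [];
  order.push("before subscribe");
  source.subscribe(() => order.push("callback"));
  order.push("after subscribe");
  return order;
}

const syncOrder = trace(makeSource(true));
const asyncOrder = trace(makeSource(false));

console.log(syncOrder);
// Logged one timer tick later, once the deferred callback has had a chance to run.
setTimeout(() => console.log(asyncOrder), 0);
```

The same calling code sees the callback run either between or after its own two steps, depending on a detail it cannot observe up front.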
Also from the first line of the README:
The Observable type can be used to model push-based data sources such as DOM events, timer intervals, and sockets.
So it seems the goal of this proposal is IO handling too.
I’m pretty sure that DOM events require synchronous emission.
I assume the events you are talking about can be handled by registering a simple callback function too, and I'm pretty sure it will respect the run-to-completion rule - by the ECMAScript standard.
@fghibellini function f() { while (true) {} } f(); console.log('something') respects the run-to-completion rule just fine; Observables don't violate that rule here.
I'm relatively sure you can't handle DOM events properly unless the callback is synchronously invoked.
I don't get your first point.
But regarding the second one, I think there's a misunderstanding - I'm not saying that events should be handled asynchronously - that would not allow us for instance to call e.preventDefault(). I totally agree with you on that one.
My complaint is regarding the interruption of the code flow when registering the handler.
Please see the following snippet - you can see an example implementation that handles events properly but the callback passed to .subscribe() is called only after the whole initialization code has finished running. Thus the output at the beginning is always 41 followed by a 42 and never the other way around.
@fghibellini Originally I had Observable.of and Observable.from (when it takes in an iterable) emit values in a new turn of the event loop to deal with just the issue you bring up. Unfortunately that was not acceptable to the other participants in the design process.
@zenparsing Yes, I know - that's why I mentioned task #60 and others above. Unfortunately I'm not familiar with the proposals' lifecycle and what people are involved in the acceptance problem. But I'm pretty sure that the above flaw should not allow the proposal to pass unless it's addressed. The JavaScript execution model (including the run-to-completion-semantics) is the single most distinct feature of the language, and is often described as beginner friendly as it prevents code concurrency unless it is explicitly introduced by the programmer (like in this case).
@fghibellini I agree with you that the synchronous dispatch that occurs with of and from is problematic from a language-level point of view. In order to fix it you'd need to make a couple of changes (if I remember correctly):
of and from dispatch in a microtask (as I originally had it).
[...] subscribe to the observer's error callback.
Even with those changes in place, there's still an open question: can custom Observables created with the Observable constructor send data to the observer before "subscribe" completes?
new Observable(sink => {
  sink.next(1);
}).subscribe(data => {
  console.log(data);
});
Should this be allowed? Should it be disallowed? What do you think?
You could place the observer in an "uninitialized" state and throw if the producer tries to call methods on it. But I thought that would be unnecessarily frustrating for users. Furthermore, sending data "on subscription" isn't all that uncommon in the wild.
Another option would be calling the subscriber function itself on a new turn. But that would make Observable unsuitable for some other use cases (e.g. wrapping addEventListener or, more generally, listening for events that occur sometime before the next turn).
@fghibellini You've actually piqued my curiosity about this!
Let's say the SubscriptionObserver has an initialization state. It is uninitialized during the call to the subscriber function and it is initialized after the call successfully completes. If the producer attempts to send data to the observer while it is uninitialized an error is thrown.
If an error occurs during a call to the subscriber function, a job is enqueued to call error on the observer (if it has one).
And then of course we fix of and from to enqueue a job rather than send immediately.
With this design you get the sequencing guarantee you are looking for while still fully supporting the synchronous dispatch (DOM EventTarget) use case.
Another benefit is that the start callback becomes unnecessary. Also, it becomes much easier to write combinators properly since they don't have to deal with the edge case of early dispatch.
The primary downside is that users that want to immediately dispatch available data will have to enqueue a job, and deal with the edge case where they might want to dispatch more data before that time (they need to be careful not to have anything out-of-sequence).
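The design described in this comment can be sketched roughly as follows. This is an illustration under assumptions, not the proposal's algorithm: `GuardedObservable` is a hypothetical name, and `queueMicrotask` stands in for "enqueue a job".

```javascript
// Sketch only: GuardedObservable is a hypothetical name, not part of the proposal.
class GuardedObservable {
  constructor(subscriber) {
    this._subscriber = subscriber;
  }

  subscribe(observer) {
    let initialized = false; // flips to true once the subscriber function returns
    let closed = false;
    let cleanup;

    const assertReady = () => {
      if (!initialized) throw new Error("Observer not ready");
    };
    const close = () => {
      if (closed) return;
      closed = true;
      if (typeof cleanup === "function") cleanup();
    };
    const sink = {
      next(value) {
        assertReady();
        if (!closed && observer.next) observer.next(value);
      },
      error(err) {
        assertReady();
        if (closed) return;
        if (observer.error) observer.error(err);
        close();
      },
      complete() {
        assertReady();
        if (closed) return;
        if (observer.complete) observer.complete();
        close();
      },
    };

    try {
      cleanup = this._subscriber(sink);
      initialized = true;
    } catch (err) {
      // The subscriber function failed: report it to the observer in a later job.
      closed = true;
      queueMicrotask(() => {
        if (observer.error) observer.error(err);
      });
    }
    return { unsubscribe: close };
  }

  // `of` enqueues a job rather than sending during the call to subscribe.
  static of(...items) {
    return new GuardedObservable(sink => {
      queueMicrotask(() => {
        for (const item of items) sink.next(item);
        sink.complete();
      });
    });
  }
}

const log = [];

GuardedObservable.of(1, 2).subscribe({
  next: v => log.push(`next ${v}`),
  complete: () => log.push("complete"),
});
log.push("subscribe returned");

// An eager producer is rejected; the error reaches the observer in a later job.
new GuardedObservable(sink => {
  sink.next("too early");
}).subscribe({
  next: v => log.push(`next ${v}`),
  error: e => log.push(`error: ${e.message}`),
});
log.push("second subscribe returned");

setTimeout(() => console.log(log), 0);
```

In this sketch both calls to subscribe return before any observer callback runs, which is the sequencing guarantee discussed above.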
A Promise executor is called synchronously, and Observable values are emitted synchronously too. That's consistent with subscribe behaviour.
@zenparsing I really don't think Observables - not even custom ones - should be allowed to push data during subscription.
I will explain my understanding of the proposal:
document.addEventListener("keyup", (evt, err) => {
  if (err) {
    handleError(err);
  } else {
    handleSuccess(evt);
  }
});
observableSubscribe(document, "keyup")
  .subscribe(handleSuccess, handleError);
The above two snippets express the same behaviour (or should).
What Observables bring to the table is an interface to callbacks very similar to how Haskell handles these kinds of problems (not surprising given that if I'm not wrong the inspiration went Haskell => LINQ => Rx).
Let's for now consider only Observable.map() which offers the same behaviour as the functor instance of any IO type in Haskell.
const f = (evt) => evt.charCode + 1; // function that we want to apply on the event
// with callbacks
document.addEventListener("keyup", (evt, err) => {
  if (err) {
    handleError(err);
  } else {
    handleSuccess(f(evt));
  }
});
observableSubscribe(document, "keyup")
  .map(f)
  .subscribe(handleSuccess, handleError);
I agree that the second one is more compositional, but you can see that one is just syntactic sugar on top of the other - and it should be that way!
My whole point is that if you're bringing a heavily functional interface to JavaScript you should probably consult with someone who has a good understanding of both functional programming and JavaScript - i.e. not me and not guys from big companies with mottos like "move fast, break things". I think the people with the best background might be folks that have written transpilers from Haskell to JavaScript, or that have worked on one of the functional languages for JS (Elm, PureScript,...). It might also be in their best interest to not further complicate the language.
I'm currently experimenting with this approach (not allowing early dispatch) in zen-observable and liking it a lot so far. It composes so much better when you don't have to account for early dispatch.
I am sure it will!
Just as a side note - in Haskell all the functor instances must satisfy some laws that basically say that using .map() only applies the function to the resolution value but doesn't affect the rest of the computation (e.g. how tasks are scheduled) in any way - I think similar laws will necessarily have to make it into this proposal too if you want to have a well behaved system.
The Haskell folks have poured years of research and tinkering into making those interfaces really beautiful, that's why I think consulting with them might be a really good idea.
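For readers unfamiliar with the functor laws mentioned above, here is a small sketch that checks them on sample input. It assumes a deliberately simplified model in which an "observable" is just a function taking a `next` callback; `fromArray`, `map` and `collect` are hypothetical helpers, and checking one input illustrates the laws rather than proving them.

```javascript
// Sketch only: here an "observable" is just a function that takes a `next` callback.
const fromArray = items => next => {
  for (const item of items) next(item);
};

// map applies f to each value and adds no scheduling of its own.
const map = (source, f) => next => source(value => next(f(value)));

// Collects everything a synchronous source emits.
const collect = source => {
  const out = [];
  source(v => out.push(v));
  return out;
};

const id = x => x;
const inc = x => x + 1;
const double = x => x * 2;
const source = fromArray([1, 2, 3]);

// Identity law: mapping the identity function changes nothing.
const identityResult = collect(map(source, id));

// Composition law: mapping twice equals mapping once with the composed function.
const mappedTwice = collect(map(map(source, inc), double));
const mappedOnce = collect(map(source, x => double(inc(x))));

console.log(identityResult); // [ 1, 2, 3 ]
console.log(mappedTwice);    // [ 4, 6, 8 ]
console.log(mappedOnce);     // [ 4, 6, 8 ]
```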
Well, if I followed the conversation correctly, this timing and ordering of the emissions reminds me of an issue on Cycle.js from some time ago.
Basically, when an HTTP request was made, the DOM driver (an abstraction over the DOM API) was not ready yet, and some messages were lost. It was solved by adding an event buffer, guaranteeing that all events emitted during the Observables' connections (subscribes) weren't lost.
Xstream, the observable-like lib written for Cycle.js, has this property, "sync start, async stop", that greatly simplifies the reasoning for programmers writing the app, but has some issues with these kinds of corner cases.
Abstraction over DOM API was not ready yet
As far as I know the DOM API doesn't have an initialisation period, and thus any abstraction over it shouldn't either.
@fghibellini That was a Cycle.js contextualization. It uses the so-called "DOM Driver", which abstracts DOM API interactions from the app logic itself.
E.g.: You don't call addEventListener on any node. You need to call DOMDriver.select("querySelector argument") and then .events("click") in order to get a stream (observable) of click events from the selected node.
@geovanisouza92 I'm sorry but I don't think that hacky implementations of JS frameworks should have a place in this discussion (simply because it would get us nowhere - there's just too many of them). Language features should be based on formally described patterns and not code that is the result of trial&error-driven development.
@zenparsing I will try to take a look at the TC39 acceptance process, to understand how it all works. But I've seen that this proposal is flagged as 🚀 in the tc39 proposal list. As you yourself said
It composes so much better when you don't have to account for early dispatch.
the difference is significant and I really think that the behaviour incompatible with the JS execution model would be the first thing that the committee would object to. Please get someone authoritative to take a look at this.
Erik Meijer, @headinthebox, has stated a number of times that observables were created to and designed for handling the asynchronous plural case. For example, see this video where Erik explains Rx in 15 minutes. Jump to 5:23 to hear how it came from the desire to build distributed cloud-based applications (async computations).
@Frikki I am well aware of Meijer's talks on Observables. I don't understand what you are trying to say though.
@fghibellini
@alex-wilmer wrote:
Forcing observables to be async breaks a well established behaviour of observables.
This doesn't resonate with anything I know about Observables. Hence, my comment.
@fghibellini
Please get someone authoritative to take a look at this.
Perhaps the current champion(s) will chime in?
In the meantime, I'm experimenting with these ideas (and other improvements) over in https://github.com/zenparsing/zen-observable . Feel free to review and comment over there as well.
@fghibellini you don't get the point. I was trying to bring some use cases that contextualize the need for delayed dispatch, not trying to bring frameworks to the conversation. Cycle.js is basically an architectural pattern and syntactic sugar to separate logic from side effects, so, I think that's a good example of applied Observables.
The original post, the linked Cycle.js issue, the async behavior of Promises, this whole discussion really, is about scheduling: https://staltz.com/primer-on-rxjs-schedulers.html in other words, disambiguating execution order (mostly breadth-first versus depth-first).
Might make sense to consider scheduling options (or the pros/cons of having only one scheduler, like Promises) in this proposal.
@geovanisouza92 I must apologise for dismissing your point without a better explanation. I agree with you that existing usages of Observables should be taken into account, but I think it would be better if the examined behaviour could be first extracted into some simplified example code instead of linking lengthy issues requiring people to understand the framework. Also I'm not experienced in this kind of discussion, so feel free to correct me if I'm wrong.
@staltz I absolutely agree with you. Schedulers is probably the main reason why I previously used the word "hacky". I don't think they are a feature that should make it into the language. Your proposed solution is basically what I was proposing and what @zenparsing is examining now - which is basically not having schedulers (same as having just one) but having consistent behaviour across the language in terms of IO handling. This is just my opinion though and I hope someone more qualified gets to take a look at this.
This breaks the well established behaviour that programmers learned to expect from JavaScript run-to-completion-semantics.
This isn't consistent or true in JavaScript at all, not even in standard APIs.
Example of where people don't expect "run to completion semantics", Observable's cousin the Promise has a constructor that accepts a function that is synchronously executed:
console.log('start');
const myPromise = new Promise((resolve) => {
  console.log('doing promisey things');
  resolve('done');
});
console.log('after promise is created');
/*
NOTE: we've already logged
"start"
"doing promisey things"
"after promise is created"
*/
console.log('inconsistency demonstration');
myPromise.then(x => console.log(x));
console.log('is almost');
/*
LOG:
"inconsistency demonstration"
"is almost"
"done"
*/
... that's entirely contrary to the section of the document you linked, which I wouldn't put much faith in.
Also, Observable has to support synchronous dispatch or it's going to be gimped as a primitive. For example, it will never be able to represent EventTarget, which can be set up and dispatched synchronously:
const handler = () => console.log('data received');
console.log('setting up');
eventTarget.addEventListener('data', handler);
console.log('dispatching event');
eventTarget.dispatchEvent(new Event('data'));
console.log('tearing down');
eventTarget.removeEventListener('data', handler);
/*
LOGS:
"setting up"
"dispatching event"
"data received"
"tearing down"
*/
Schedulers is probably the main reason why I previously used the word "hacky". I don't think they are a feature that should make it into the language.
Schedulers already have made it into the language in one form or another. setTimeout for example.
If you force async, there's less you can model with observables as a primitive. It rules out using Observable for behaviors, for similar reasons that Promise makes things difficult at times (e.g. you want to use a primitive for a cancellation semantic that prevents execution? Too bad, this primitive is going to force you to wait a turn).
You can always opt in to asynchronous behavior. You can't really opt out once it's forced.
Forced async makes this proposal DOA, in my opinion.
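To make the "opt in to asynchronous behavior" point concrete, here is a rough sketch. `deferNotifications` is a hypothetical helper, the object-with-`subscribe` shape is a simplification of the proposal's API, and unsubscription is left out for brevity.

```javascript
// Sketch only: an "observable" here is an object with subscribe(observer).
const syncSource = {
  subscribe(observer) {
    observer.next(1);
    observer.next(2);
    observer.complete();
  },
};

// Hypothetical helper: re-delivers every notification in its own microtask.
// Microtasks run in FIFO order, so the original ordering is preserved.
function deferNotifications(source) {
  return {
    subscribe(observer) {
      return source.subscribe({
        next: value => queueMicrotask(() => observer.next(value)),
        error: err => queueMicrotask(() => observer.error(err)),
        complete: () => queueMicrotask(() => observer.complete()),
      });
    },
  };
}

const log = [];
deferNotifications(syncSource).subscribe({
  next: value => log.push(`next ${value}`),
  error: err => log.push(`error ${err}`),
  complete: () => log.push("complete"),
});
log.push("subscribed");

setTimeout(() => console.log(log), 0);
```

Going the other way is not possible: once delivery has been pushed to a later job, no wrapper can bring it back into the current one.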
@benlesh
I think we need to be very careful that we're all speaking the same language. Nobody is arguing for a primitive that cannot model EventTarget.
Absolute requirements:
I think we all agree on these requirements.
The more fuzzy question is: do we allow the producer to send data to the consumer before subscribe has returned? It is possible to disallow this behavior without sacrificing the two requirements above.
By example:
new Observable(sink => {
  sink.next(); // ERROR! subscription not initialized
}).subscribe();
Implementation wise, you put the subscription observer into an "initializing" state, you don't allow notifications while the subscription is in that state, and you only mark it as "initialized" after the subscriber function has returned.
Clearly we can model EventTarget with such an abstraction.
The obvious disadvantage of such an approach is that you force some abstractions (e.g. of) to notify in a new job (with a new stack).
The advantage of this approach is that
I'm curious: what kind of abstractions would be impossible with such a restriction? Can we list the use cases that it would eliminate?
@zenparsing a typical situation that comes up off the top of my head would be when you want to synchronously produce N values, where N is indeterminate. Observable.range(0, Number.POSITIVE_INFINITY).take(Math.round(Math.random() * 100))
If you don't notify during subscription, the take has no means of counting and unsubscribing.
We can get rid of the hacky "start" method.
An alternative to this is just supplying the subscription as a second argument to next.
@benlesh I'm not sure I completely understand the issue. Could you explain a little more what the problem is?
This got me thinking about take, though.
Here's the take that I would expect a user to write:
function take(source, n) {
  if (n <= 0) {
    return Observable.from([]);
  }
  return new Observable(sink => {
    let count = 0;
    return source.subscribe({
      next(v) {
        sink.next(v);
        if (++count === n) {
          sink.complete();
        }
      },
      error(e) { sink.error(e) },
      complete() { sink.complete() },
    });
  });
}
Under my proposal here, this would be correct. But with the current Observable spec, it would be wrong. You'd have to do something like this to deal with an "eager producer":
function take(source, n) {
  if (n <= 0) {
    return Observable.from([]);
  }
  return new Observable(sink => {
    let count = 0;
    return source.subscribe({
      _subscription: null,
      start(s) {
        this._subscription = s;
      },
      next(v) {
        sink.next(v);
        if (++count === n) {
          sink.complete();
          this._subscription.unsubscribe();
        }
      },
      error(e) { sink.error(e) },
      complete() { sink.complete() },
    });
  });
}
It's tricky and subtle and non-compositional. We ought to be able to compose by simply returning the result of source.subscribe and have the inner subscription automatically close when we complete our consumer. But we can't. We have to grab an early reference to the subscription and remember to also call unsubscribe on it.
This is really the crux of the argument: that allowing "eager producers" breaks compositionality.
Composition only breaks if you blindly assume all Observables are asynchronous. That's what promises are for: to create an async contract. Observables are lower level than that: they're just functions that can "return" multiple times. Just like a function, an Observable is not assumed to be asynchronous.
Maybe it's clearer to say that the whole lifecycle of an observable should be allowed to occur in a single tick. That's a much better primitive to build on, as was said before.
@benlesh I think I understand what you were saying now. Here's how I would implement such a thing with the semantics I'm proposing:
https://gist.github.com/zenparsing/89f3a1ae89e996e145afcde0e8ade96b
It works, I think.
@zenparsing Yeah, if you're using an iterable, you're fine, sure... because you can offset nexting from that iterator until the subscriber function is complete. However, if you implement this the most straightforward way, in the subscriber function, there's no mechanism to stop it if all emissions are buffered until the subscriber function returns. Obviously a loop to infinity isn't something most people would do (although I've seen it in a surprising number of apps, myself), but imagine you are caching several thousand rows you're going to loop over and you only want to take a few. Are we going to force everyone to use an iterator?
The other side of this is Observable's "duality" with Iterable. Iterable has a mechanism for synchronously "stopping": don't next. With what you're proposing Observable will only have a mechanism for synchronously stopping if it's used in a specific way with an iterable.
What are we trying to fix? Is something broken about observables as they currently exist in the wild? In practice I've never seen what is being proposed here to be necessary at all, nor have I seen any bugs related to it. The sync/async behavior of Observables is well established and it suits its (primarily functional) use cases. Forcing async behavior will add limitations to how you can use observable, so if we're going to force this, hopefully it's fixing something that is horribly broken.
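The cached-rows case from this comment can be sketched as below. `MiniObservable` is a hypothetical, stripped-down class written for this illustration; it borrows the idea of a `start` hook and a `closed` flag but is not the proposal's implementation.

```javascript
// Sketch only: a stripped-down observable with a "start" hook and a `closed` flag.
class MiniObservable {
  constructor(subscriber) {
    this._subscriber = subscriber;
  }
  subscribe(observer) {
    let closed = false;
    const subscription = { unsubscribe() { closed = true; } };
    const sink = {
      get closed() { return closed; },
      next(value) { if (!closed) observer.next(value); },
      complete() {
        if (closed) return;
        closed = true;
        if (observer.complete) observer.complete();
      },
    };
    // Hand the subscription over before any data can flow.
    if (observer.start) observer.start(subscription);
    if (!closed) this._subscriber(sink);
    return subscription;
  }
}

const rows = Array.from({ length: 10000 }, (_, i) => i);
let produced = 0;

// The "most straightforward" producer: a plain loop inside the subscriber function.
const cached = new MiniObservable(sink => {
  for (const row of rows) {
    if (sink.closed) break; // stop as soon as the consumer has had enough
    produced++;
    sink.next(row);
  }
  sink.complete();
});

const taken = [];
cached.subscribe({
  start(subscription) { this.subscription = subscription; },
  next(row) {
    taken.push(row);
    if (taken.length === 3) this.subscription.unsubscribe();
  },
});

// With synchronous delivery, taken holds 0, 1, 2 and only 3 rows were produced.
console.log(taken, produced);
```

With synchronous delivery the consumer's unsubscribe is visible to the loop on its next iteration. If every notification were buffered until the subscriber function returned, the loop would have walked all 10000 rows before the consumer saw the first one.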
Forcing async behavior will add limitations to how you can use observable, so if we're going to force this, hopefully it's fixing something that is horribly broken.
It's definitely not "horribly" broken. It works pretty well. But the current maybe sync, maybe async behavior is demonstrably problematic, and as the OP points out, it doesn't match up well with JavaScript's stricter "async or sync but not both" model. I think it's worth exploring whether we are better off with an async-only push stream primitive.
I think @fghibellini said it best:
The problem is that if you have a generic function that returns some observable and you cannot know whether it is synchronous or asynchronous then you basically have to assume an arbitrary execution order. And unpredictable order of execution and mutable data don't really get along well.
Maybe sync, maybe async behavior is a long running standard in the observable world, and you'd be breaking all conventions and documentation by forcing async behavior. I think it would be a questionable "win" at the cost of potential memory leaks and GC issues due to the buffering behavior it would have to have. I highly recommend sticking to what's tried and true.
@benlesh
Example of where people don't expect "run to completion semantics", Observable's cousin the Promise has a constructor that accepts a function that is synchronously executed:
console.log('start');
const myPromise = new Promise((resolve) => {
  console.log('doing promisey things');
  resolve('done');
});
console.log('after promise is created');
/*
NOTE: we've already logged
"start"
"doing promisey things"
"done"
*/
You are not logging "done" anywhere so it should definitely not be in the output. Yes, the executor (the function passed to the Promise constructor) is run synchronously, BUT the value resolves asynchronously. e.g.:
console.log("start");
new Promise(resolve => {
  console.log('doing promisey things');
  resolve("done");
})
.then(x => console.log(x));
console.log('after promise is created');
outputs:
start
doing promisey things
after promise is created
done
from the main page of this proposal:
The Observable type can be used to model push-based data sources such as DOM events, timer intervals, and sockets.
And here we are discussing processing of infinite sequences that has nothing to do with events. As I mentioned before it would seem that Observables are simply events just like JavaScript always had (or at least as far as the younger of us can remember) only with an interface that allows transformations to be applied to them. If this is true, you're implementing something that has already been well described and implemented in other languages (in a better way) only under a different name - this would also mean that there's a far more established behaviour than the one hacked up by JavaScript hackers.
Maybe sync, maybe async behaviour is a long running standard in the observable world, and you'd be breaking all conventions and documentation by forcing async behaviour.
I would really appreciate if people stopped using the words standard and Observables together as this is the proposal for standardisation of Observables. Every single aspect of this feature should be evaluated (@zenparsing makes a good point why).
I highly recommend sticking to what's tried and true.
I highly recommend you trying out C I heard it's very tried and true.
@zenparsing I took a look at your implementation, unfortunately I will not have time until the weekend to send you proper feedback with example code.
In short though - I find the behaviour of Observable.of(...) problematic when called with multiple elements. I think .of() should just take a value and put it into the context of an event (very much like pure works in Haskell) and it doesn't really make sense for more than one value - I would leave that to Observable.from(array), which in existing languages is implemented in terms of flatMap() - and the semantics goes something like - create an event from the first value and no matter how it's handled create an event from the second one.
Also regarding .dispatchEvent() - my proposal will work in all cases when promises would (promises spec) as long as the code that calls .dispatchEvent() can be considered "platform code" i.e. there's no other logic running above it in the stack. The rule is not in the spec for no reason.
FWIW: I'm only 90% against this change. I was thinking about it, and using a queue scheduler would accomplish similar behavior in RxJS, and past implementations have defaulted to that, but not at zero cost. Speed goes down, and implementation complexity goes way up. Behavior obviously changes to breadth first instead of depth first, and I suspect forcing this sort of scheduling would be hard to do on people trying to build custom observables with the constructor.
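The depth-first versus breadth-first difference mentioned here can be illustrated with a toy scheduler. This is a sketch under assumptions: `run`, `immediate` and `createQueueScheduler` are hypothetical names, and the trampoline below is a generic technique, not a copy of any RxJS scheduler.

```javascript
// Sketch only: `run` emits two values, each of which triggers one derived value.
function run(schedule) {
  const order = [];
  const emit = (label, depth) => {
    order.push(label);
    if (depth < 1) {
      schedule(() => emit(label + "a", depth + 1));
    }
  };
  schedule(() => {
    emit("1", 0);
    emit("2", 0);
  });
  return order;
}

// Immediate scheduling: nested work runs right away (depth first).
const immediate = task => task();

// Queue (trampoline) scheduling: nested work waits for the current task to finish.
function createQueueScheduler() {
  const queue = [];
  let draining = false;
  return function schedule(task) {
    queue.push(task);
    if (draining) return; // the active drain loop will pick this task up
    draining = true;
    try {
      while (queue.length > 0) queue.shift()();
    } finally {
      draining = false;
    }
  };
}

const depthFirst = run(immediate);
const breadthFirst = run(createQueueScheduler());

console.log(depthFirst);   // [ '1', '1a', '2', '2a' ]
console.log(breadthFirst); // [ '1', '2', '1a', '2a' ]
```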
@fghibellini
You are not logging "done" anywhere so it should definitely not be in the output.
Thanks, it was a typo. I've fixed it.
I highly recommend you trying out C I heard it's very tried and true.
I have no idea what this is. Some sort of dig? I'm not even sure what it means. But this has gotten adversarial, at least on one end, so I'm out.
Unsubbing from thread, please don't @. Sorry, @zenparsing, it was a solid discussion.
I highly recommend you trying out C I heard it's very tried and true.
I have no idea what this is. Some sort of dig? I'm not even sure what it means. But this has gotten adversarial, at least on one end, so I'm out.
I was trying to say that just because a feature has been widely used doesn't mean it's a good feature - C being an example of a very widespread language that nonetheless doesn't help the programmer much in writing correct code. The phrasing was inappropriate and I'm really sorry about that.
I'd love for observables to have microtask-deferred semantics but I thought we established long ago that it simply doesn't work with plural events well.
That is, you need to be able to subscribe to events in time. That is, if you're making an API that has its own delay-with-microtask semantics you need to be able to "run before it" to "catch the event in time".
Let's say you're listening on HTTP requests and then want to read the body off an observable. The body might dispatch the initial bytes deferred by a microtask - and you have to wait a microtask to subscribe to it - you're pretty much guaranteed to miss an event.
When I tried implementing observables like this - I failed. Schedulers are nice but the capability is what observables need to have for this.
For what it's worth - promises could benefit a lot from schedulers too - capability wise and for instrumentation.
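The HTTP body scenario above can be sketched as follows. `createBody` is a hypothetical stand-in for an API that sends its first chunk one microtask after creation; the listener-array model is a simplification for illustration only.

```javascript
// Sketch only: a hot "body" source that delivers its first chunk
// one microtask after it is created.
function createBody() {
  const listeners = [];
  queueMicrotask(() => listeners.forEach(listener => listener("first chunk")));
  return {
    subscribe(listener) {
      listeners.push(listener);
    },
  };
}

const received = { sync: [], deferred: [] };

// Subscribing synchronously: the listener is registered before the chunk is sent.
const bodyA = createBody();
bodyA.subscribe(chunk => received.sync.push(chunk));

// Subscribing one microtask later: the chunk has already been sent by then.
const bodyB = createBody();
queueMicrotask(() => bodyB.subscribe(chunk => received.deferred.push(chunk)));

setTimeout(() => console.log(received), 0);
```

In this sketch the deferred subscriber ends up with nothing, because the source's own microtask was queued first.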
@benjamingr
Let's say you're listening on HTTP requests and then want to read the body off an observable. The body might dispatch the initial bytes deferred by a microtask - and you have to wait a microtask to subscribe to it - you're pretty much guaranteed to miss an event.
What you are describing is asynchronous subscription, and you're right that we cannot do that.
What we're exploring here is different: we are suggesting disallowing the emission of data from within the subscriber function.
For example:
new Observable(observer => {
  observer.next(); // Throws an "Observer not ready" error
});
This restriction makes the protocol much simpler: consumers don't have to protectively program against the sync-dispatch case and we can eliminate that silly "start" method.
The downside is that observables which emit in-memory data on subscription (like Rx's BehaviorSubject and ReplaySubject) have to change their behavior somewhat.
Does that make sense?
Won't it be an issue for any operator emitting Observables? Something like rxjs' groupBy will have to buffer the first value of each group I am guessing, a value that might need to be emitted synchronously.
@dinoboff Thanks for bringing up groupBy. If I understand correctly, groupBy will work with these restrictions. It is implemented on top of Subject, which does not emit data within the call to subscribe.
Observable
  .of('a', 'b', 'aa', 'bb', 'aaa')
  .groupBy(v => v[0]) // Group by first letter
  .subscribe(inner => {
    let subscription = inner.subscribe(v => {
      // This will never be called before "subscribe" returns
    });
  });
As far as I can tell, only BehaviorSubject and ReplaySubject (and the operators based upon them) are affected: they must send their cached data in a job.
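A rough sketch of what "send their cached data in a job" could look like for a BehaviorSubject-like type. `CachedSubject` is a hypothetical class, not the RxJS implementation; it also shows one possible way to keep values in sequence when new data arrives before the job runs.

```javascript
// Sketch only: a hypothetical BehaviorSubject-like class, not the RxJS implementation.
class CachedSubject {
  constructor(initial) {
    this._value = initial;
    this._entries = new Set();
  }

  next(value) {
    this._value = value;
    for (const entry of this._entries) {
      // Observers still waiting for their first job are skipped here;
      // they will receive the latest value when that job runs.
      if (entry.ready) entry.observer(value);
    }
  }

  subscribe(observer) {
    const entry = { observer, ready: false };
    this._entries.add(entry);
    queueMicrotask(() => {
      if (!this._entries.has(entry)) return; // unsubscribed in the meantime
      entry.ready = true;
      observer(this._value); // latest value at delivery time
    });
    return { unsubscribe: () => this._entries.delete(entry) };
  }
}

const log = [];
const subject = new CachedSubject("a");

subject.subscribe(value => log.push(`got ${value}`));
log.push("subscribe returned");
subject.next("b"); // arrives before the job runs, so the observer first sees "b"

setTimeout(() => {
  subject.next("c");
  console.log(log);
}, 0);
```

The observer never sees the stale "a" in this sketch, which is one way to handle the out-of-sequence edge case raised earlier in the thread.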
Thanks for the clarification @zenparsing - what if next emits an observable we have to subscribe to?
@benjamingr Yes, that would be the groupBy use case. I think that works fine, as long as the inner observable does not emit a value from within the call to its subscribe.
Here's a (somewhat) simple example:
// The outer observable emits an observable every second
var outer = new Observable(outerSink => {
  let interval = setInterval(() => {
    let innerSink;
    // The inner observable emits the current timestamp and then stops
    outerSink.next(new Observable(sink => { innerSink = sink }));
    // Data is sent only after the user has had a chance to subscribe
    innerSink.next(Date.now());
    innerSink.complete();
  }, 1000);
  return () => {
    outerSink.complete();
    clearInterval(interval);
  };
});
outer.subscribe(inner => {
  let subscription = inner.subscribe(value => {
    // This will never get run before "subscribe" returns
    console.log(value);
  });
});
Got it, the inner observable has to defer its nexts too so we won't be subscribing to it too late since microtick order is guaranteed. Thanks for clarifying.
In this spec some observables emit synchronously (e.g. Observable.from()). This breaks the well established behaviour that programmers learned to expect from JavaScript run-to-completion-semantics. Callbacks and Promises always resolve asynchronously (promises-are-always-asynchronous). This is not by accident and greatly simplifies the reasoning needed to understand a program. Citing MDN - Event loop:
Promises
Possible executions:
Observables
Possible executions:
In other words the control flow graph of the first example is simpler, and the complexity of the second case increases with the number of async operations performed during the first cycle. (also simpler CFGs enable compilers to better analyze and in turn optimize code)
In #49 @zenparsing says: