repeaterjs / repeater

The missing constructor for creating safe async iterators
https://repeater.js.org
MIT License

Question: observables and callback hell #33

Closed brucou closed 4 years ago

brucou commented 4 years ago

I am reviewing the library, and running through the documentation, I came across this:

Being a callback-based API makes using observables with async/await and promises awkward; in fact, observables suffer from the same issue of “callback hell” which promises were designed to solve. Observable libraries are aware of this and provide “higher-order observable operators,” which work on observables of observables, but these solutions are seldom used and virtually incomprehensible to human beings, who are unaccustomed to thinking in extradimensional spaces.

I have been using observables for a while and I wonder what you meant by callback hell occurring also with observables. Do you have an example? If by higher-order observable operators you meant flatMap or switchMap and the like, I have actually been using those pretty often, and I would disagree that they are incomprehensible to human beings. That is quite a strong statement.

Aside from developer experience, I guess my next question is: is there something that you can write with repeaters that you cannot with observables? Or, to put it another way, are repeaters more expressive than observables? I can think about backpressure as a possible candidate, but I haven't been able to figure out the answer from the documentation, so I am asking here.

Thanks!!

brainkim commented 4 years ago

Perhaps I was being glib when I wrote that 😬. I was thinking of the RxJS operators which work on observables of observables and how incomprehensible they seemed (to me). These functions are often accompanied by wild marble diagrams like the following: [image: insane marble diagram 1] and I have yet to see a production codebase which actually uses observables of observables in a meaningful way.

These operators are trying to solve the problem of callback hell, which is when code with callbacks tends to expand into a pyramid-like structure, except this time it’s calls to pipe and subscribe which have to be flattened. I’m sure RXJS experts have got this figured out, but I bristle at the idea of using a function to concatenate two observables together, when with async iterators, concatenation is just iteration in an inner loop:

for (const iter of iters) {
  for await (const value of iter) {
    console.log(value);
  }
}

And even when I was working with a single observable, I was never able to remember which of switchMap, flatMap, mergeMap, concatMap, etc. I needed in a specific situation. You can say that these operators solve the problem of callback hell, but I don’t see them being as effective as async/await and the yield* operators, if only because async/await/yield are actual language features.

Is there something that you can write with repeaters that you cannot with observables?

Let me do you one better and point out something you can write with observables that you cannot with async iterators: you cannot synchronously dispatch updates using async iterators. For instance, when you subscribe to a BehaviorSubject, you can be assured that when you call next on the subject, all subscription functions will fire synchronously:

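import { BehaviorSubject } from "rxjs";

const subject = new BehaviorSubject(0);
// A BehaviorSubject replays its current value to each new subscriber, so 0 is logged here.
subject.subscribe(console.log);
subject.next(1);
console.log("a");
subject.next(2);
console.log("b");
// => 0, 1, "a", 2, "b"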

Callback-based APIs have the advantage that the callback can be run synchronously when data updates, whereas async iterators are asynchronous on both ends, so that if you have a shared mutable data structure like the DOM, its state can change between when you push data into the async iterator and when you pull data from the iterator. In this way, async iterators are bad at describing the “current” state of some shared data, and I’ve run into race conditions when trying to use async iterators in this manner.
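
The gap between pushing and pulling can be reproduced in a few lines. This is only a sketch: `shared`, `set`, `changes` and `staleReads` are names made up for the example, and the plain `shared` object stands in for something like the DOM.

```javascript
async function staleReads() {
  const shared = { value: 0 }; // hypothetical shared mutable state
  const queue = [];            // values pushed but not yet pulled

  // Push side: updates the shared state and records the change.
  function set(value) {
    shared.value = value;
    queue.push(value);
  }

  // Pull side: an async generator which drains the recorded changes.
  async function* changes() {
    while (queue.length > 0) {
      yield queue.shift();
    }
  }

  set(1);
  set(2);

  const seen = [];
  for await (const pulled of changes()) {
    // By the time 1 is pulled, the shared state has already moved on to 2.
    seen.push({ pulled, current: shared.value });
  }
  return seen;
}
```

Calling `staleReads()` resolves to `[{ pulled: 1, current: 2 }, { pulled: 2, current: 2 }]`: the consumer handles the first update while the shared state already reflects the second.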

However, if you need exactly synchronous dispatch, it’s my belief that observables are too high-level an abstraction anyways, and you’re much better off using something like an EventTarget with dispatchEvent or something like a MutationObserver where you can call takeRecords when you need to. Observables are stuck in a middle-ground where their high-level use-cases are better solved by async iterators because async iterators have language-level support, and low-level use-cases are better solved by one of the many other callback-based APIs which are already available in javascript.
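
For comparison, here is a minimal sketch of synchronous dispatch with a plain EventTarget. It assumes an environment where `EventTarget` and `Event` are globals (browsers, or Node.js 15 and later); the `"update"` event name, the `UpdateEvent` class and its `value` property are invented for the example.

```javascript
// Hypothetical event type carrying a value.
class UpdateEvent extends Event {
  constructor(value) {
    super("update");
    this.value = value;
  }
}

function syncDispatchLog() {
  const log = [];
  const target = new EventTarget();
  target.addEventListener("update", (ev) => log.push(ev.value));

  // dispatchEvent runs the listeners before it returns.
  target.dispatchEvent(new UpdateEvent(1));
  log.push("a");
  target.dispatchEvent(new UpdateEvent(2));
  log.push("b");
  return log; // [1, "a", 2, "b"]
}
```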

I can think about backpressure as a possible candidate

I think you are correct in pointing out that backpressure is an important feature of async iterators which observables are unable to replicate. Most observable solutions do not solve the fundamental problem of backpressure, i.e. what to do when producers produce values faster than consumers consume them, and simply defer the problem further downstream to later consumers. You can think of observables and async iterators as a series of connected pipes, and all observables do to deal with backpressure is to let a certain pipe or joint leak, or hang a bucket from a specific pipe. This means that if the flow is too strong, there still can be problems further down the pipeline. Async iterators can emulate the solutions observables provide but also can do something observables cannot, which is to slow down producers, which in the above analogy would be to tighten the spigot which leads into the house or something.
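
A small sketch of the "tighten the spigot" case, using only language features. The names `delay`, `produce` and `consume` are made up for the example; the point is that the generator body does not advance past a `yield` until the consumer asks for the next value.

```javascript
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Producer: records each value it produces, then waits at `yield`
// until the consumer pulls again.
async function* produce(limit, log) {
  for (let i = 0; i < limit; i++) {
    log.push(`produced ${i}`);
    yield i;
  }
}

// Consumer: deliberately slower than the producer could run.
async function consume(limit) {
  const log = [];
  for await (const value of produce(limit, log)) {
    await delay(5);
    log.push(`consumed ${value}`);
  }
  return log;
}
```

`consume(3)` resolves to a strictly interleaved log (`produced 0`, `consumed 0`, `produced 1`, `consumed 1`, `produced 2`, `consumed 2`): the producer never runs ahead of the slow consumer, and no buffer is needed.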

This stackoverflow answer goes into more detail about the differences between observables and async iterators, and I encourage you to read it if you haven’t already. Essentially, repeaters are the glue code which converts “push-based” APIs into a “pull-based” async iterator API.
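
To illustrate what "push to pull" glue involves, here is a hand-rolled sketch. It is not the repeater library's API or implementation: `fromPush`, `ticker` and `firstThree` are hypothetical names, the `subscribe` function is assumed to take `{ next, complete }` callbacks and return an unsubscribe function, and the queue is unbounded for brevity.

```javascript
// Adapts a push-based source into an async iterator.
async function* fromPush(subscribe) {
  const queue = [];    // values pushed but not yet pulled
  let done = false;
  let wake = () => {}; // resolves the consumer's pending wait, if any

  const unsubscribe = subscribe({
    next(value) { queue.push(value); wake(); },
    complete() { done = true; wake(); },
  });

  try {
    while (true) {
      while (queue.length > 0) yield queue.shift();
      if (done) return;
      // Nothing queued: sleep until the source pushes or completes.
      await new Promise((resolve) => { wake = resolve; });
    }
  } finally {
    unsubscribe(); // runs on completion and on early exit (break/return)
  }
}

// Hypothetical push-based source: emits 0, 1, 2, ... every `ms` milliseconds.
function ticker(ms) {
  return ({ next }) => {
    let i = 0;
    const id = setInterval(() => next(i++), ms);
    return () => clearInterval(id);
  };
}

async function firstThree() {
  const seen = [];
  for await (const tick of fromPush(ticker(5))) {
    seen.push(tick);
    if (seen.length === 3) break; // triggers the `finally` block above
  }
  return seen; // [0, 1, 2]
}
```

The `break` is what stops the interval: leaving the loop finishes the generator, which runs `finally` and unsubscribes from the source.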

Also, you say “aside from developer experience,” but I’ve found that async iterators are just fun to work with, producing elegant and readable code. Give async iterators (and maybe repeaters 😇) a try for a week in a small, isolated part of your codebase. I guarantee the code which you create will be easier to read, maintain and explain. I look forward to finding new patterns (and perhaps even frameworks!) based on async iteration, because I think it’s one of the coolest and underused features of modern javascript.

brucou commented 4 years ago

Wow, thanks for the long answer. I understand your points, but some quick comments.

async/await/yield are actual language features.

Async iterators are still Stage 3 I think, so you actually have to compile them down, which for some browsers amounts to the same thing as having to use a library, though admittedly that will change over time.

concatenation is just iteration in an inner loop:

Your code example actually has the same indentation that you would have when using observables of observables.

I guess a lot of your arguments come down to taste and familiarity. I got pretty comfortable with observables over the years, so I can't say it is a problem for me, and I actually find them pretty readable, but that is because of my familiarity with them, not necessarily because they are easier or simpler. It is always the same: once you have properly learned something, it looks easy. I had trouble with generators when they came out, and that also went away with time, but I haven't used async generators enough for them to feel intuitive to me. They are indeed a pretty versatile tool. I guess if you are a fan of functional programming and dataflow programming, observables are not so bad. It is the same as iterating over an array with a for loop vs. using Array.prototype.map: both work, but I would argue in favor of map for readability.

But the backpressure thing is real. RxJS has some operators dealing with backpressure, like throttle (lossy backpressure); there used to be a pause operator; and you can also have lossless backpressure, but generally with unbounded buffers (!). I believe that as of now, if you want sliding buffers, you have to implement them yourself, but I am not sure. RxJS v4 had an operator for just about everything; they cut that down in subsequent versions. In any case, generic backpressure (where you imperatively push and pull) is indeed a little bit complex with just standard RxJS. Because RxJS supports declarative programming (dataflow programming is declarative), you have to define your backpressure strategy ahead of time. But again, you get all the advantages of a declarative programming paradigm.
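
For readers unfamiliar with the term, a sliding buffer can be sketched in a few lines. This is a hand-rolled illustration, not an RxJS operator and not the repeater library's own buffer class; `SlidingQueue` and `slidingDemo` are hypothetical names, and the capacity is assumed to be at least 1.

```javascript
// A bounded queue which, when full, evicts the oldest value to make room
// for the newest one: memory stays bounded, but old values can be lost.
class SlidingQueue {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = [];
  }

  // Called by the producer.
  add(value) {
    if (this.items.length >= this.capacity) {
      this.items.shift(); // drop the oldest value
    }
    this.items.push(value);
  }

  // Called by the consumer; returns undefined when the queue is empty.
  remove() {
    return this.items.shift();
  }
}

function slidingDemo() {
  const queue = new SlidingQueue(2);
  for (const value of [1, 2, 3, 4]) queue.add(value);
  return [queue.remove(), queue.remove(), queue.remove()]; // [3, 4, undefined]
}
```

The strategy is fixed when the queue is created, which matches the point about declaring a backpressure strategy ahead of time.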

Then another common issue with dataflow programming and I think the issue will exist too for async iterators is to wire back the dataflow on itself, i.e. having circular dataflows. That can be done manually in Rxjs with subjects, or you can use a library (framework) like cycle.js to do that for you.

In short, with my own experience, I see real value in the ease with which you handle backpressure with async iterators, but otherwise I think it really comes down to what you already know or master and what you find hard.

brainkim commented 4 years ago

Async iterators are still Stage 3 I think

They’re Stage 4 and officially part of the language spec (https://github.com/tc39/proposal-async-iteration/issues/131).

your code examples actually have the same indentation that you would have when using observables of observables

Sure, but how “hellish” your code becomes is not based on the total levels of indentation but how easy it is to split up your code into manageable functions/methods. A careful node.js programmer can mitigate callback hell by creating intermediate functions using the callback pattern, but it’s difficult to name these functions and it requires a lot more effort than equivalent promise code. Similarly, I think observable-based code suffers from a lack of composability as well, where you’ll see long pipe chains whose individual parts are difficult to name or abstract. I wish I could show you actual code examples, but I don’t use observables anymore and I’m finding that most non-trivial observable examples and tutorials are now behind paywalls.

I think async iterators naturally flatten out because of three features which observables lack:

  1. for await loops evaluate to a promise which settles when the loop finishes.
  2. for await loops and async generators/repeaters can contain await operators.
  3. async generators provide yield and yield * operators to allow delegation of iteration from within loops.

These three features align async iteration with promises in a way that I haven’t seen for observables. For instance, if we’re in an async generator, the above concat example can be further reduced to:

for (const iter of iters) {
  yield *iter;
}
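
A runnable version of this concat, as a sketch: `concat`, `range` and `collect` are names chosen for the example. `collect` also shows the first feature from the list, since an async function wrapping a `for await` loop returns a promise which settles when the loop finishes.

```javascript
// Concatenates any number of sync or async iterables into one async iterator.
async function* concat(...iters) {
  for (const iter of iters) {
    yield* iter;
  }
}

// Hypothetical source for the demo: yields start, start + 1, ..., end - 1.
async function* range(start, end) {
  for (let i = start; i < end; i++) {
    yield i;
  }
}

// Reduces an async iterator to a promise of an array.
async function collect(iter) {
  const values = [];
  for await (const value of iter) {
    values.push(value);
  }
  return values;
}
```

`collect(concat(range(0, 2), range(5, 7)))` resolves to `[0, 1, 5, 6]`.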

The ease with which you can reduce the iteration of an async iterator to a promise, or abstract a sequence of promise calls as an iterator, allows you to more easily combine or separate chunks of code into coherent parts. Compare this to observables, whose functions either return more observables (pipe) or subscription objects (subscribe), the latter of which are not really composable but merely defined for cleanup purposes. Promises themselves are nowhere to be seen. This is why I always find it somewhat amusing when observable tutorials include diagrams like the following: [image] The authors of observable articles do elaborate mental contortions to shoehorn observables into the bottom right corner, when the most analogous data structure is undisputedly async iterators.

I guess a lot of your arguments come down to taste and familiarity.

I should probably be magnanimous and concede this but I think that observables are objectively harder to understand than async iterators. When you use async iterators you can use break, if and mutate variables in scope freely; in other words, all the skills you’ve developed from working with synchronous for- and while- loops transfer over, and you don’t have to reason about function closures. One conclusion I’ve drawn from ES2018+ is that we should begin moving away from APIs which use callbacks when the code could be equivalently written with promises, iteration or async iteration. So this means using for...of instead of forEach, async iterators instead of observables, etc.
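
As a small illustration of ordinary statements inside a `for await` loop, consider the following sketch. `countUp` and `sumEvensBelow` are hypothetical names; the `try`/`finally` is there to show that `break` also finishes the generator and runs its cleanup.

```javascript
// An endless async source with a cleanup step.
async function* countUp(log) {
  try {
    for (let i = 0; ; i++) {
      yield i;
    }
  } finally {
    log.push("cleanup");
  }
}

async function sumEvensBelow(limit) {
  const log = [];
  let sum = 0; // a plain variable in scope, mutated freely
  for await (const n of countUp(log)) {
    if (n >= limit) break;     // ends the loop and finishes the generator
    if (n % 2 !== 0) continue; // skips odd numbers
    sum += n;
  }
  return { sum, log };
}
```

`sumEvensBelow(7)` resolves to `{ sum: 12, log: ["cleanup"] }`.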

The reason callbacks are inherently more difficult to understand is that there are more unknowns related to call expressions which take functions than with their callback-less equivalents. We often talk about the cyclomatic complexity of code, where the complexity of a piece of code can be objectively determined by how many if/switch/while/for blocks there are in the code. I think callback-based APIs are like for loops on steroids, because without knowing the underlying details of the API there are way more unknowns about the code.

For instance, the following code:

iif(value, () => {
  console.log("hello world");
});

is objectively more complex than:

if (value) {
  console.log("hello world");
}

even if iif and if behave the same in all cases. Similarly, you could imagine a definition of Array.prototype.map which runs in reverse-order over the array, and you wouldn’t know until the first time you put a side-effect in the callback which depended on iteration order. Callback-based APIs are powerful, and require drilling-down into their implementations to fully understand, whereas statements are constant throughout the language.
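
The reverse-order map can be made concrete with two hypothetical functions, `mapForward` and `mapReverse`, which share a signature and return identical results for pure callbacks. Only a callback with a side effect reveals the difference.

```javascript
function mapForward(array, fn) {
  const result = new Array(array.length);
  for (let i = 0; i < array.length; i++) {
    result[i] = fn(array[i], i);
  }
  return result;
}

function mapReverse(array, fn) {
  const result = new Array(array.length);
  for (let i = array.length - 1; i >= 0; i--) {
    result[i] = fn(array[i], i);
  }
  return result;
}

// Records the order in which a given map implementation visits its elements.
function visitOrder(map) {
  const order = [];
  map([1, 2, 3], (x) => {
    order.push(x); // side effect which depends on iteration order
    return x * 2;
  });
  return order;
}
```

Both `mapForward([1, 2, 3], (x) => x * 2)` and `mapReverse([1, 2, 3], (x) => x * 2)` return `[2, 4, 6]`, but `visitOrder(mapForward)` returns `[1, 2, 3]` while `visitOrder(mapReverse)` returns `[3, 2, 1]`.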

For these reasons, I try to avoid callbacks except when I feel they’re truly necessary, and I think that iterators and promises have the ability to render much of what we do with callbacks unnecessary.

Sorry for the rant. In terms of the actionability of this issue, I think you’re right that saying observables suffer from “callback hell” is maybe a bit of a reach, and we can distinguish between observables and repeaters in a way that is more clear and honest. I’ll try to work on the docs a bit when I get the chance.

brainkim commented 4 years ago

Then another common issue with dataflow programming and I think the issue will exist too for async iterators is to wire back the dataflow on itself, i.e. having circular dataflows.

I’m really curious about this use-case, and would like to learn more if you have any examples.

brucou commented 4 years ago

observables are objectively harder to understand than async iterators. When you use async iterators you can use break, if and mutate variables in scope freely; in other words, all the skills you’ve developed from working with synchronous for- and while- loops transfer over, and you don’t have to reason about function closures. etc.

This is basically saying that imperative programming is objectively easier than functional programming, given that you already know about imperative programming.

I’m really curious about this use-case, and would like to learn more if you have any examples.

I don't have a meaningful example off the top of my head, but a toy example would be:

a -> b -> c -> d
  ^       |
  +-------+

where a is some data stream, transformed into b etc. What happens here is that the computed c is fed back into the operation that transforms a into b. So if I call that operation f, then b = f(a+c) where + here is a merge of streams, then if we define g so that c = g(b), we have b = f(a + g(b)) which is a recursive definition, which has to be handled properly.
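
One possible way to write such a loop with an async generator is sketched below. It makes several assumptions which are not part of the description above: `feedback` is a hypothetical name, fed-back values are processed before the next value of a, `g` returns `undefined` when there is nothing to feed back (which is what keeps the recursion from running forever), and the stage from c to d is omitted.

```javascript
// Emits the b stream, where b = f(a + g(b)) and "+" is the merge described above.
async function* feedback(source, f, g) {
  const pending = []; // inputs waiting for f: the current a, then fed-back c values
  for await (const a of source) {
    pending.push(a);
    while (pending.length > 0) {
      const b = f(pending.shift());
      yield b;
      const c = g(b);
      if (c !== undefined) {
        pending.push(c); // wire c back into the input of f
      }
    }
  }
}

// Drains the feedback loop into an array.
async function runFeedback(source, f, g) {
  const out = [];
  for await (const b of feedback(source, f, g)) {
    out.push(b);
  }
  return out;
}
```

With `f = (x) => x - 1` and `g = (b) => (b > 0 ? b : undefined)`, `runFeedback([3, 1], f, g)` resolves to `[2, 1, 0, 0]`: the input 3 circulates until it reaches 0, then the input 1 is processed.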

brainkim commented 4 years ago

This is basically saying that functional programming is objectively harder than imperative programming, given that you already know about imperative programming.

I don’t think I made this argument! To be clear, I am a fan of functional programming and techniques like recursion, immutable data structures and referential transparency. The claim I’m making is much more specific, that:

In the context of modern javascript, between two equivalent code snippets, one which uses statements/operators and one which uses callbacks, the code which uses statements/operators is necessarily easier to understand.

This is because you don’t have to drill down into the implementation of statements and operators; once you figure out the semantics of the await operator, you know how it works in all situations. So to extrapolate to our current discussion, yes it’s true that equivalent code can be written with both observables and async iterators, but it’s not simply a matter of developer familiarity which makes async iterators easier to learn.

The real reason is that 1. once you learn the semantics of for await you can apply it to all async iterators and 2. async iterator statements play nicely with other statements/operators like async/await/if/continue/break/yield etc. Compare this with observables, where the subscribe method can have different behavior based on whether the subscription is a ReplaySubject or a BehaviorSubject, and where RXjs defines a suite of “operators” like iif and takeWhile to replicate the behavior of if and break statements for observables.

In any case, if you’re happy with observables, I think you should definitely keep using them. They are mature, highly performant, well-specified, and will likely become a part of the language at some point. I just think that async iterators are a happy accident in javascript where we extrapolated from sync iterators and generators and got a data structure which can describe virtually any sort of computation or side-effect.

brucou commented 4 years ago

In any case, if you’re happy with observables, I think you should definitely keep using them. They are mature, highly performant, well-specified, and will likely become a part of the language at some point. I just think that async iterators are a happy accident in javascript where we extrapolated from sync iterators and generators and got a data structure which can describe virtually any sort of computation or side-effect.

Well, async iterators are also well specified I believe, and performant, and mature (I mean the concept is), and ahead of observables when it comes to inclusion in the JS standard. My question really is about understanding what one can do and the other cannot. Having two tools instead of one is not an issue, but if you are going to have only one, it is useful to know what that one can do. I am definitely attracted by the ease with which lossless backpressure can be handled with iterators, and that is not because they are imperative or simpler (or spatial vs. temporal) but because they pull data instead of pushing data (like RxJS observables), so you have natural control over the producer. Lossless backpressure in push-based streams usually requires unbounded buffering, which is not always an option. By the way, I recently discovered that RxJS can implement lossless backpressure without resorting to unbounded buffering... thanks to iterators :-) Cf. https://itnext.io/lossless-backpressure-in-rxjs-b6de30a1b6d4 . Basically a pull behaviour is replicated with two pushes.

Well anyways, thanks for the time spent addressing my questions, and thanks for publishing this library. I will mention it in one of my upcoming articles as an option for handling backpressure smoothly with iterators. You may close this if you want; I am not doing it because I see that you assigned it to yourself, so maybe you prefer to keep it open.

brainkim commented 4 years ago

Rephrased the docs to avoid mentioning callback hell. Closing but feel free to continue discussion.

brucou commented 4 years ago

The article about backpressure which mentions repeaters.js: https://www.infoq.com/news/2019/10/reactiveconf-2019-backpressure/

brainkim commented 4 years ago

Thanks for the mention. Yeah backpressure is crucial for creating robust systems the more I think about it.