Provide a justification for not using Observable

tc39 / proposal-cancellation

Proposal for a Cancellation API for ECMAScript

https://tc39.github.io/proposal-cancellation

BSD 3-Clause "New" or "Revised" License

Provide a justification for not using Observable #3

Open jhusain opened 7 years ago

jhusain commented 7 years ago

Please provide a justification for not using Observable to signal cancellation. The register/unregister mechanism clearly overlaps with the subscribe/unsubscribe methods provided by an Observable and its Subscription. While it's arguable that cancellation tokens do not require a complete notification, it is not clear to me that this warrants expanding the surface area of the language to encompass such similar concepts.

One possible argument I could see for duplicating Observable semantics rather than using composition is that accepting only one callback might enable some performance optimizations. However the same optimizations could be made opportunistically in the event only one handler was attached to the Observable:

// you register for cancellation like this given the current proposal
let subscriptionLike = ct.register(someAction);

// if the cancellation token had a "cancelled" Observable, it would likely be subscribed this way...
let subscription = ct.cancelled.subscribe({ next: someAction });
// ...or this way...
let subscription = ct.cancelled.subscribe({ complete: someAction });
// ...but rarely like this:
let subscription = ct.cancelled.subscribe({ next() { /* not clear what would be necessary to do here */ }, complete: someAction });

Note that it is highly likely that the cancel token will be subscribed using only a single handler as there is no value in registering multiple handlers on an Observable which notifies only once and never rejects. Consequently I believe it is possible to get the same performance in the majority of cases by building Cancel Tokens on Observable. Furthermore this approach will not further expand the surface area of the API with very similar types.

RangerMauve commented 7 years ago

Observable.subscribe can take callbacks rather than an observer, so this would still be valid code:

let subscriptionLike = ct.subscribe(someAction);

rbuckton commented 7 years ago

I'm not opposed to renaming register/unregister to subscribe/unsubscribe, but I would be hesitant to take a dependency on Observable until it has moved further along in the standards track, as WHATWG is moving ahead fairly quickly.

I do expect cancellation tokens will have more than one subscriber, especially when building single-page applications on the web or a client application using electron. However, using Observable here feels like overkill as the signal from a cancellation token can only ever be triggered once and the complete and error callbacks from Observable could lead to confusion for users.

rbuckton commented 7 years ago

Also, if we do not take a dependency on Observable, it might be better to leave the callback registration mechanism naming as-is to reduce possible confusion.

I could foresee CancellationToken having a [Symbol.observable], allowing you to use Observable.from(ct).

RangerMauve commented 7 years ago

I'm not opposed to renaming

That might be a good track to follow, then. If listening for cancellation uses the same interface as Observable, or at least a subset of it and not a superset, then you could easily say "Actually, this was an Observable the whole time" later on.

jhusain commented 7 years ago

There are two principled approaches here:

We should create a new type every time a subset of an existing type's semantics are required for a use case.
"It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures." —Alan Perlis

In this case, I agree with Perlis. The argument that the simplified semantics of register/unregister will be easier to learn argues for a local maximum. The total complexity of learning both the proposed Promise cancellation subscription API and the Observable API is clearly greater than learning only the more general Observable API. If we accept that most developers are likely to learn both, then we have made the overall learning curve steeper for developers.

I'm also not convinced that creating a one-off type for cancellation notification actually makes the API more intuitive. How do you intend to communicate that developers do not need to call unregister after being notified? Developers who have experience with either EventTarget or EventEmitter may very well assume that explicit unregistration is necessary. In contrast the completion semantics of Observable clearly sets the expectation that unsubscription is not required.

I think it's difficult to make a principled argument that we need to create a version of Observable without a completion or error semantic for those cases where only notification is required without making a similar argument for Iterables/Iterators. Many useful functions return iterators which never complete. Should we create a new type for these functions to return because they don't happen to use the completion semantic? This is not the norm in most languages with rich iterator libraries.

My advice continues to be to use either Promises or Observables to signal cancellation. Why not simply remove the special Cancel exception and use the original CT proposal?

getify commented 7 years ago

I'm currently renovating a house, so the common "right tool for the right job" adage is more real than ever to me at the moment.

I'm standing on a ladder in front of a wall we're removing, and in one hand I have a regular hammer, and in the other I have a mallet/small sledgehammer. Ostensibly, they are both forms of the same kind of tool: a heavy blunt surface used to impact things (either for building or tearing down). But they are definitely not the exact same tool.

The mallet is much heavier and makes a bigger hole. The hammer is lighter and more precise, but makes a much smaller hole.

There's a fair bit of overlap in these two tools. I can punch a bigger hole with the hammer by hitting a few times in the same area. I can more carefully swing the mallet, on its edge, and make a slightly smaller or more precise hole.

I'm not an experienced construction worker. But even with my amateur wall tearing down skills, I can clearly see that I benefit from having both tools. Different parts of the job are better suited for each "hammer".

The fact that I have more to "learn" -- what a hammer is good for and what a mallet is good for -- is only the slightest of detractions, more than overcome by the fact that my arm muscles and fingers/hands appreciate that each tool is designed specifically for its respective tasks.

Learning happens once, but I swing these tools literally hundreds of times to take down the wall.

I understand the attraction of "oh, look, cool, that thing X has a similar shape to this other thing Y" so "let's just use Y for X". I'm aware that many (including some on this thread) also argue that promises aren't needed because we can just use single-value observables. The claim of this thread that observables can be used as cancellation tokens feels almost identical to that other line of reasoning. And I don't agree with either claim.

Promises are good. Observables are good. And this newly proposed cancellation token tool also seems pretty useful. I'd rather have 3 good purpose-built tools to choose from than one tool that tries to do everything and only does each task so-so.

RangerMauve commented 7 years ago

And this newly proposed cancellation token tool also seems pretty useful

I wasn't saying anything against the rest of the API. It just seems that a small subset of this new API is a way to listen for the cancellation taking place, and that listening has a way to "unlisten", which looks very similar to how Observables work. I just think it would make sense to use Observables for this part of the API, since it's pretty much the same as an Observable with a single onNext listener already.

aluanhaddad commented 7 years ago

@RangerMauve having specialized APIs for distinct concepts is something that has stood the test of time. Even when moving to increasingly higher levels of abstraction, such distinctions often prove valuable.

Consider the process of learning functional programming.

Early on, when we realize the power, versatility, and composability of something like List, we might be inclined to wonder why functional languages seem to invariably have an Option construct.

Surely we can model Option well enough as a List that is always either empty or is holding precisely one element, and surely by doing so we gain significant flexibility and reuse but, by doing so, we could also be seen as burdening Option with unnecessary complexity by exposing operations that make little to no sense on them.

While it is true that mapping over a List and mapping over an Option both make complete sense, and so it would be great to reuse the mapping capability inherent in List to express the same facility in Option, how should we address operations like grouping and sorting? It makes sense to group a List but I don't know that it makes any sense to group an Option and sorting an Option seems way off in left field.

Even if we look at specific operations over just List, we can see value in having a distinction. Filter and Map can be expressed in terms of FlatMap and FlatMap in turn can be expressed in terms of FoldLeft. Surely, having all of these operations increases the surface area of the APIs which we must learn and yet surely we want to have all of them so that we can express our intent clearly and concisely.

RangerMauve commented 7 years ago

@aluanhaddad Your example of different types in Functional programming doesn't completely apply here.

In this case it's basically the same API and the same use-case, but with slightly different naming conventions and a lack of error handling.

To clarify, I'm only talking about being able to "register" a listener for when the token gets cancelled, I don't think that everything to do with cancellation should be redone with Observables.

In this case it's more like introducing a new DOM API that has its own flavor of EventTarget with slightly different names for the sake of not having to use the Event type it provides, or an async API that decided it needed its own Promise-like type rather than leveraging the rest of the ecosystem so that it could have a single node-style callback instead of two callbacks.

Leveraging Observables, the type being added to the language to deal with events, seems obvious for listening on the token cancellation event. Not using it will only make things more annoying for users, who will have to wrap it when they're integrating tokens with the rest of their codebase. We'll also need more libraries dealing specifically with combining tokens rather than observable libraries for combining events.

As well, JavaScript isn't known for having lots of different types like functional programming languages we see in the wild. It's got a few, powerful, types which work together to get things done. We don't have a whole bunch of category-theory types that can be made from each other. Adding more types rather than working with the existing ones is trying to make JavaScript something totally different.

Also, what is the main benefit of having a separate type for this here? And what will happen the next time an API is introduced that has a similar use case (listening on an event)?

ericelliott commented 7 years ago

Just chiming in to agree with @jhusain's point.

A cancellation is an event that can (optionally) happen once. It might need a reason for cancellation. It may have one or many observers. This sounds a whole lot like a promise to me.

I don't see much point in over-complicating it.

Further, a promise is basically an asynchronous stream that may produce one value, with an error-handling path. An observable is essentially the same thing, but may produce many values over time.

So, a promise is basically a single-valued observable. If it literally was an observable, you could reuse the same set of utilities to work with promises that we can use with observables.

Yes, different kinds of hammers are valuable, but in situations where you could use a modular socket wrench set or 50 different wrenches in a giant case, I'll pick the modular socket wrench set every time, if only because it rids me of the organization and storage requirements of the alternative.

There is a lot to gain when we settle on a single abstraction to handle these use-cases, and a lot of time to lose to duplicated effort (in learning, in tooling, in wrangling libraries) if we split out a bunch of different APIs for each individual use-case.

Why does cancel token have to be anything other than an observable or a promise?

Volune commented 7 years ago

Why does cancel token have to be anything other than an observable or a promise?

I'm not sure for Observables, but it is not possible to inspect the state of a Promise. Inspecting the state of a CancellationToken is a useful feature at the beginning of an asynchronous operation. For example, you can check the fetchAsync method in the proposal: it will not send the request if it is called with a cancelled token.

I'm not saying a token should or should not extend / internally use a Promise. Just that we cannot use simple Promises as tokens.
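To make the state-inspection point concrete, here is a minimal sketch. `SketchTokenSource`, `cancellationRequested`, and `fetchSketch` are names made up for this illustration and are not claimed to match the proposal's API; `send` is injected so the sketch runs without a network.

```javascript
// Hypothetical token source for illustration; not the proposal's actual API.
class SketchTokenSource {
  constructor() {
    let requested = false;
    const callbacks = new Set();
    this.token = {
      // Synchronous state inspection, which a bare Promise does not offer.
      get cancellationRequested() { return requested; },
      register(callback) {
        if (requested) { callback(); return { unregister() {} }; }
        callbacks.add(callback);
        return { unregister() { callbacks.delete(callback); } };
      },
    };
    this.cancel = () => {
      if (requested) return; // the signal fires at most once
      requested = true;
      const pending = [...callbacks];
      callbacks.clear();
      for (const callback of pending) callback();
    };
  }
}

// Hypothetical stand-in for a fetch-like operation that checks the token first.
function fetchSketch(url, token, send) {
  if (token.cancellationRequested) return false; // never start the work
  send(url);
  return true;
}

const sketchSource = new SketchTokenSource();
const sentUrls = [];
const sentBeforeCancel = fetchSketch('/a', sketchSource.token, (u) => sentUrls.push(u));
sketchSource.cancel();
const sentAfterCancel = fetchSketch('/b', sketchSource.token, (u) => sentUrls.push(u));
console.log(sentBeforeCancel, sentAfterCancel, sentUrls); // true false [ '/a' ]
```

The second call never reaches `send`, which is the behavior a plain Promise used as a token could not provide without an extra flag.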

ericelliott commented 7 years ago

I'm not sure for Observables, but it is not possible to inspect the state of a Promise.

That's fair, but maybe fetchAsync should be using a lazy API instead of a promise? You don't have to inspect the state of an observable to accomplish preemptive cancellation because observables are lazy -- nothing runs until you tell them to run. If you want to preempt a computation, you should use an API that represents a future computation, not a future value: Pull instead of push.

Promises

Promises represent values, not computations
The computations which produce those values are immediately invoked by default.
There currently are no mechanisms to control computations (when or if they start, cancellation). I recognize that this proposal is meant to address the second part, but using a CancellationToken to prevent a computation from ever starting feels like an awkward stretch.

Tasks

Tasks represent a future computation, not a future value.
The computation can be preempted because it isn't invoked until you call .run().
Cancellation with resource cleanup is a built-in feature, not a tacked-on afterthought.
Tasks could return a promise from .run() so you can still use stuff like async/await with them

Task bonus:

Functions that return tasks can easily be made pure. Most promise-returning functions are not pure because they kick off the computation side-effects immediately. That makes task logic easier to reason about and unit test.

Observables

Like tasks, they're lazy -- you pull instead of push -- observables don't do anything until something subscribes. If you don't want to trigger the side-effects, don't subscribe.
You could build something like a cancellable task on top of the observable API, and you'd get the nice laziness and pure function features for free.

Either option seems like a better fit for the problems CancelToken is attempting to solve, and there's already an active proposal for Observables on the standards track.
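The eager/lazy distinction above can be sketched as follows. `TaskSketch` and its `run` method are hypothetical names chosen for this illustration, not an existing or proposed API; the only built-in behavior relied on is that a Promise executor runs synchronously at construction.

```javascript
// Hypothetical lazy Task: it stores the computation and only invokes it in run().
class TaskSketch {
  constructor(computation) {
    this.computation = computation;
  }
  run() {
    // Wrapping in a Promise keeps async/await usable, as suggested above.
    return new Promise((resolve) => resolve(this.computation()));
  }
}

let eagerStarts = 0;
let lazyStarts = 0;

// The Promise executor runs immediately, so the work has already started.
const eagerPromise = new Promise((resolve) => { eagerStarts += 1; resolve('value'); });

// Constructing the task starts nothing; it can still be dropped for free.
const lazyTask = new TaskSketch(() => { lazyStarts += 1; return 'value'; });
const startsBeforeRun = { eager: eagerStarts, lazy: lazyStarts };

const taskResult = lazyTask.run();
const startsAfterRun = { eager: eagerStarts, lazy: lazyStarts };

console.log(startsBeforeRun, startsAfterRun);
```

Preempting the lazy task needs no token at all: simply never call `run()`.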

RangerMauve commented 7 years ago

This sounds a whole lot like a promise to me.

The only issue with promises is that reacting to resolution is async, whereas sync cancellation could be useful in a lot of cases. That's why Observable would be better suited than Promise for this signaling.
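A small sketch of that timing difference, using only built-in Promise behavior; `cancelSketch` and the callback list are made-up names standing in for any synchronous notification mechanism.

```javascript
const cancelLog = [];
const syncCallbacks = [];

let resolveCancelled;
const cancelledPromise = new Promise((resolve) => { resolveCancelled = resolve; });

// Promise reactions are always deferred to a later microtask.
cancelledPromise.then(() => cancelLog.push('promise reaction'));
// A plain callback list can notify synchronously.
syncCallbacks.push(() => cancelLog.push('sync callback'));

function cancelSketch() {
  resolveCancelled();
  for (const callback of syncCallbacks) callback();
  cancelLog.push('cancel returned');
}

cancelSketch();

// Snapshot taken before any microtask has had a chance to run.
const logRightAfterCancel = [...cancelLog];
console.log(logRightAfterCancel); // [ 'sync callback', 'cancel returned' ]
```

The promise-based listener has not yet observed the cancellation when `cancelSketch()` returns, so any work scheduled synchronously after the call could still proceed as if nothing had been cancelled.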

Jamesernator commented 6 years ago

I definitely agree with the points raised that Observable shouldn't be used for cancellation. To me, using an Observable for a single-value sequence is comparable to using an Array to represent a single number. I just can't see any reason I'd want to do that; it would simply add additional work for zero additional benefit in the 99% case (sure, Array operators might be useful in the 1% case, but they're the exception, not the rule).

I understand that people really like being able to use the same set of operators on things with Observables, but many types conceptually share operators. Data types exist because there are differences in what different data types mean. Sure, you can represent all data as a single type (that's just lambda calculus), but the reason we don't do that is because it's nice to have strong guarantees about the data we're working with. For example, while .map and .debounce might make sense on a cancellation, .distinct makes no sense.

As a developer my ideal API would be easy to interop with existing types such as Promises while also being easy to convert other types to it (e.g. implementing [Symbol.observable] for those who need it).

It'd look something like this in use (without assuming any syntactic cancellation on await for now):

async function poll(condition, interval, cancelToken) {
    while (!condition()) {
        await Promise.race([
            delay(interval),
            cancelToken.rejectIfRequested(),
        ])
    }
}

// Inside some resource preloader that preloads resources that are needed
// together but if loading is cancellable we might as well finish
// loading the resource and cache it

const simpleCache = new Map()

async function loadAll(resources, cancelToken) {
    const responses = new Map()
    for (const resource of resources) {
        // Load might not accept a cancellation token but
        // we can still cancel between requests
        if (!simpleCache.has(resource)) {
            const response = await load(resource)
            simpleCache.set(resource, response)
        }
        cancelToken.throwIfRequested() // Synchronous because we don't
                                       // need to race asynchronously
        responses.set(resource, simpleCache.get(resource))
    }
    return responses
}

// Doing work that requires access to a restricted resource
// This is a real case I've implemented using a variant of CancelTokens

// Doing too many concurrent screenshots completely destroys the ability
// to do work on the machine, in the real application this is dependent
// on the number of cores on the machine
const MAX_CONCURRENT_SCREENSHOTS = 5
const acquire = semaphore(MAX_CONCURRENT_SCREENSHOTS)

// browser supplied by puppeteer
async function takeScreenshot(browser, cancelToken) {
    const unlock = await acquire()

    try {
        const page = await browser.newPage()

        // Logic for taking screenshot here...
        await Promise.race([
            page.goto('http://localhost/...'),
            cancelToken.rejectIfRequested()
        ])

        // ...
        return screenshot
    } finally {
        // Fortunately my specific implementation of unlock is idempotent.
        // But if someone had a semaphore whose unlock was shared, as
        // naively written classic semaphores tend to be, then
        // multiple emitted values from an Observable triggering the
        // cleanup logic would be disastrous, as it'd potentially
        // decrease the semaphore while another consumer held it
        // or, worse, put the semaphore negative
        unlock()
    }
}

That last example is one of the reasons I think Observable would be disastrous: if next could accidentally be emitted multiple times while the consumer assumed that a cancelToken only emits next once, cleanup logic could lead to bizarre bugs that are difficult to debug.

Now sure, I could just convert Observable-based cancellations to Promises, but then why bother having the cancellation as an Observable? From my experience in using them, and from what I've seen of even examples using them, if you use Observables for one thing it's fairly impractical not to use them for absolutely everything else without a significant amount of boilerplate converting between the two worlds.

For people who think that Observable-based cancellation would actually help rather than simply create more work and obscure intent, then it would be nice to provide demonstrations of how this will work in a nice way that isn't going to be more work for people than having a dedicated type. Certainly I found the old CancelToken API pretty intuitive just from the methods it provides, I'm not even sure how I'd use the Observable-based one 90% of the time without converting to a Promise or to something that's basically CancelToken.

RangerMauve commented 6 years ago

Over in #16 they're talking about a potentially new type for "Signals". Maybe that would be useful to have as a primitive for these sorts of things. Signal => single event (might be sync or async), Observable => Multiple events (might be sync or async), Promise => Value that will be generated asynchronously, AsyncIterable => Values generated asynchronously

ericelliott commented 6 years ago

@Jamesernator

You can take(1) from an observable stream to make sure you don't accidentally get more than one value. You could specialize observable to always take(1) with a function that just adds sugar around the API for cancellable promises (for a similar sugar function using promises for cancel tokens, see speculation and a description of canceling promises with speculations in the "What is a Promise?" blog post).
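As a rough sketch of the take(1) idea, the wrapper below forwards only the first value from a source and then unsubscribes. `createSourceSketch` and `takeOneSketch` are hand-rolled, hypothetical helpers for this illustration; they are not the Observable proposal's API or any library's operator.

```javascript
// Hypothetical minimal event source; subscribe() returns an unsubscribe function.
function createSourceSketch() {
  const listeners = new Set();
  return {
    subscribe(next) {
      listeners.add(next);
      return () => listeners.delete(next);
    },
    emit(value) {
      for (const listener of [...listeners]) listener(value);
    },
  };
}

// Hand-rolled stand-in for a take(1) operator.
function takeOneSketch(subscribe) {
  return (next) => {
    let finished = false;
    let unsubscribe = null;
    unsubscribe = subscribe((value) => {
      if (finished) return; // ignore anything after the first value
      finished = true;
      if (unsubscribe) unsubscribe();
      next(value);
    });
    // Covers sources that emit synchronously during subscribe().
    if (finished) unsubscribe();
    return () => { finished = true; unsubscribe(); };
  };
}

const noisySource = createSourceSketch();
let releaseCount = 0;

// Cleanup logic that must run at most once, like the semaphore unlock above.
takeOneSketch(noisySource.subscribe)(() => { releaseCount += 1; });

noisySource.emit('cancel');
noisySource.emit('cancel'); // an accidental second emission is ignored
console.log(releaseCount); // 1
```

Note that the guarantee only holds if every consumer remembers to apply the wrapper, which is the kind of extra step the dedicated-type side of this thread objects to.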

Would a little sugar around observables tick all the boxes? Your point about .distinct in the context of cancel tokens is a good one, but you can attach operations to observables with transducers, rather than tacking every op in the history of observable operations to every observable instance. Instead of tacking a ton of methods directly to the observable spec in JS, we could make ops transducers and use them on an as-needed basis for the data type you're dealing with like Clojure did when they implemented core-async. Rather than reproduce all the ops for yet another datatype, they modeled all the standard functional operations as transducers, which are agnostic of the transport data type.

In that context, observables are just a lazy subscription mechanism, and you bring your own ops depending on your particular needs and use-case.

ericelliott commented 6 years ago

comparable to using an Array to represent a single number. I just can't see any reason I'd want to do that...

I use arrays to represent zero or one number all the time, so that I can treat it abstractly with the same API without using conditionals -- which is what a cancel token really is: it's the presence or absence of a cancellation signal.

[].map(x => x * 2); // returns [], no errors, no weird NaN values to deal with
// vs
undefined * 2; // NaN -- now we need crazy `if` statement branching

You might be tempted to compare cancelToken to a simple binary state: true or false; CancelRequested, or CancelNotRequested, except that the value will always start out in one state and may transition into another state sometime in the future. That's not a single binary value. It's a signal. It's one or more states expressed over time, which observables are perfectly capable of modeling. And as I've already mentioned, if you only want to listen to one state transition, take(1).