Generators are not observers

zenparsing commented 9 years ago

We've been struggling for quite a while now with the issue described and discussed in #14. The original question was: should canceling the subscription result in calling return on the observer?

RxJS does not invoke onComplete on cancelation. From the Rx point of view, "canceling the subscription" means that the consumer is telling the producer that it doesn't want any more notifications, so it wouldn't make sense to send it a completion signal.

The desire to call observer.return on cancelation was motivated by the fact that observers can be implemented using generator functions. A generator is state-full and may require explicit cleanup of resources upon termination. When the subscription is canceled, we need to signal the generator that it should execute its finally blocks and perform cleanup. Therefore, we need to invoke return on the generator.

There have been a couple of different proposals for solving this dilemma:

Add a .NET-like dispose protocol to JavaScript and add a dispose method to generators. The dispose method would be executed on cancelation. For generators, it would simply call this.return(undefined).
Add additional methods to the subscription object which would allow the subscription to be terminated with arbitrary abrupt completions. The holder of the subscription would choose the appropriate cancelation behavior.

Both solutions require adding complexity to the system. They only differ in where that complexity is located. Where is this complexity coming from?

The complexity arises from the fact that we are attempting to express Rx observers with generators, and there is an impedance mismatch between them. The complexity creep arises when we attempt to resolve this mismatch.

Some of the differences include:

Generator	Rx Observer
Methods return `IteratorResult`	Callbacks return `void`
Returning `{done: true}` signals cancelation	Calling `subscription.dispose` signals cancelation
Requires cleanup with `return`	Does not require explicit cleanup

I believe that pursuing the current strategy of conflating Rx observers with generators is going to result in a confusing mess. I think we need to choose between them. Either we take the observer-callback approach:

let observable = new Observable((onNext, onError, onComplete) => {
    // ...
});

observable.subscribe(
    val => console.log("item: " + val),
    err => console.log("error: " + error),
    ret => console.log("complete: " + ret)
);

Or we take the generator approach:

let observable = new Observable(sink => {
    // ...
});

observable.subscribe(function*() {
    try {
        // Yield loop here
    } finally {
        // Cleanup
    }
});

// Or with a manually written generator:
observable.subscribe({
    next(val) { },
    throw(err) { },
    return(val) { },
});

If we can choose one of these models, I think the rest of the design will fall into place.

domenic commented 9 years ago

+1. As I outlined in https://github.com/zenparsing/es-observable/issues/14#issuecomment-116365675, there is no intrinsic connection or duality with generators; instead, the interface is a dualization of iterators, with methods renamed to pun on the generator method names. That pun thus allows, via duck-typing, the (ab)use of generator functions as subscribers, but in my opinion it is quite awkward (priming, infinite loop).

@zenparsing, you argue in https://github.com/zenparsing/es-observable/issues/33#issuecomment-117231632 that that awkwardness is just part of the model for using generators "as data sinks." This is interesting and I am not sure what to think about it, mainly because I never really became comfortable with the sink half of the generator API; my experience is mostly from C# which does not have that. Maybe Python has enough experience here to help us.

One thing to keep in mind is that it is very easy to write a translation function, with whatever fidelity the protocol allows, to allow generator usage if people really want to. I personally think

prime(function*() {
  let errored = false;
  try {
    while(true) {
      handleValue(yield);
    }
  } catch (e) {
    errored = true;
    handleError(e);
  } finally {
    if (!errored) {
      handleReturn();
    }
  }
});

is a lot less appealing to write than

{
  onNext(v) { handleValue(v); },
  onError(e) { handleError(e); },
  onComplete() { handleReturn(); }
}

so I am not sure who wants to do that. But they can write that function if they want.

zenparsing commented 9 years ago

@domenic Here's my attempt to write that function:

function subscribeGenerator(observable, generatorFunction) {

    // Create and prime the generator
    let generator = generatorFunction();
    generator.next();

    let subscription = observable.subscribe(
        val => {
            let result = generator.next(val);
            if (result.done)
                subscription.unsubscribe();
        },
        err => { generator.throw(err) },
        ret => { generator.return() }
    );

    return {
        unsubscribe() { 
            subscription.unsubscribe();
            generator.return();
        }
    };
}

let subscription = subscribeGenerator(observable, function*() {
    // All that try-finally-yield-loop stuff
});

benjamingr commented 9 years ago

@domenic

+1. As I outlined in #14 (comment), there is no intrinsic connection ... with generators;

If that's OK I want to address this. The connection is that subscriptions are basically sequences of values and generators in JavaScript are both producers and consumers of values.

Modeling observers with generators makes sense in that we're using the "consumer" aspect of generators to yield values into the observer.

The fact they are not dual I think was shown by both of us in #14 and now that I understand the proposal better it's crystal clear why - since generators are both producers and consumers of values. This I think sums up both your point and mine in #14 as well as what I believe @zenparsing is saying here above.

@zenparsing I think we can keep using generators if:

We stop claiming duality, that has not been helpful so far (except in identifying problems).
We add dispose semantics on the subscription, we can call .return but honestly - although it was asked about multiple times we don't actually need cleanup on the observer from what I understand. It might be beneficial to handle cleanup somehow but certainly not in a confusing place like finally blocks of the generator - that makes sense but will confuse people.
Bye bye priming. I know you disagree it's a footgun but we've seen it's unnecessary by accepting generators directly (like you've stated here), or even by keeping track using a WeakMap.

While we're at it, I think it could be great if subscribe() returned a thenable, our earlier conclusion was to not tie cancellation semantics with this proposal - which I agree with. It can be a regular promise with an additional dispose method we can specify as we please with no relation to promises - this means forEach and subscribe can be merge and observables can be awaited.

zenparsing commented 9 years ago

Here's an attempt to write a function that adapts the other way (from generator-style to callback-style):

function subscribeRx(observable, onNext, onError, onComplete) {

    let subscription = observable.subscribe({

        next(val) { 
            onNext(val);
            return { done: false };
        },

        throw(err) { 
            onError(err);
            return { done: true };
        },

        return(ret) { 
            if (!canceled)
                onComplete(ret);

            return { done: true };
        },
    });

    return {

        unsubscribe() {
            canceled = true;
            subscription.unsubscribe();
        }
    };
}

I've left out the cases where the user doesn't provide one of the callbacks, but that's easy enough to add.

After looking at both of these adapters, I'm even more convinced that we need to choose either generators or callbacks for the core design, and allow the user to adapt as desired.

I would actually lean toward the generator-based design, with one simple addition: add a "unsubscribed" getter to the subscription which will allow the user to determine whether return was invoked by unsubscription or by source completion. This will make writing manual generators a little easier.

let subscription = observable.subscribe({
    next(val) { 
        // Whatever
    },
    throw(err) {
        // Whatever
    },
    return(val) {
        if (!subscription.unsubscribed)
            onCompleteStuff();
        cleanupStuff();
    },
});

I prefer this approach. (Which basically means we stick with the design as currently specced.)

zenparsing commented 9 years ago

@benjamingr I'm fine with using generators instead of rx-style callbacks (in fact I slightly prefer it), but I'm strongly opposed to extending GeneratorPrototype with dispose or anything else. As you stated elsewhere, for generators return is dispose. : )

benjamingr commented 9 years ago

@benjamingr I'm fine with using generators instead of rx-style callbacks (in fact I slightly prefer it), but I'm strongly opposed to extending GeneratorPrototype with dispose or anything else. As you stated elsewhere, for generators return is dispose. : )

Just because I stated something elsewhere doesn't mean it's right :D

As for return, I agree that it is dispose in a sense but the behavior with finally blocks is something that's going to be very non-obvious for users.

That said, if generators are accepted to subscribe and the subscription can be awaited (has then) then two large issues I've had are resolved (priming, and await support). Still, I think we should stop claiming duality (for reasons stated above) and I'm very unsure about the method names (they're not obvious or intuitive).

By the way, I think it would be a great service (and help push the proposal) - if there was a page that:

Explained observer, observable and subscription as entities and what part they play.
Explain why observers model to generators (I think @jhusain gave a talk about it somewhere) and how that plays out.
Give motivating DOM usages (this was requested in the TC meeting IIUC) that also explain observer, observable and subscription.
Motivate different types of cancellation.

Either a wiki page or a regular markdown page would be good IMO.

spion commented 9 years ago

Is there any discussion anywhere why the observer's return method can receive a value? AFAICT onComplete in RxJS doesn't receive any values, so from that perspective it seems weird that this proposal adds it...

benjamingr commented 9 years ago

@spion generators are producers and consumers, from the "using generators" point of view that's what the generator interface takes. For a regular generator in other cases it does make sense for the generator's return to take a value.

spion commented 9 years ago

Thats why I keep differentiating between return-the-method and return-the-keyword. I'm talking about return-the-method here. To my knowledge, the generator-based observer cannot in any way take the value passed to it using return-the-method and do something with it, whereas a non-generator observer would be able to do that.

The question is why would an observer even have the ability to receive a value there - seems to me that it not only that makes the interface subtly incompatible with generators, but it is also different from the RxJS (and Rx.*) implementation(s).

Let me try to demonstrate

observable.subscribe({
  next(x) { console.log("Got value ", x); }
  throw(e) { console.error("Got error ", e); }
  return(v) { console.log("Got return value", v);
})

How does that last method work with a generator:

observable.subscribe(prime(function* () {
  let errored = false;
  try {
    while(true) {
      console.log("Got value ", yield);
    }
  } catch (e) {
    errored = true;
    console.error("Got error", e);
  } finally {
    if (!errored) {
      console.log("Got return value ", ???); // what do I put here?
    }
  }
}));

To get that, you'd have to have something like a returned keyword allowed in generators - then you could ditch finally except for cleanup, and just write

observable.subscribe(prime(function* () {
  try {
    while(true) {
      console.log("Got value ", yield);
    }
  } catch (e) {
    console.error("Got error", e);
  } returned (v) {
    console.log("Got return value ", v);
  }
}));

and as a bonus, not have to track errored state all the time.

Am I right, or did I just sleep through something like returned (...) { ... } being added to the language?

zenparsing commented 9 years ago

I reimplemented the mouse drag demo using a callback-only/no-generator API and here's what it looks like:

https://github.com/zenparsing/es-observable/blob/callback-style/demo/mouse-drags.js

This is the same demo using the generator-based API:

https://github.com/zenparsing/es-observable/blob/always-async/demo/mouse-drags.js

It seems to me that the advantages lie with the callback style:

It requires a bit less typing.
Callbacks are easier to pass around than methods on objects.
Callbacks are more similar to the Promise API.
The callback style is easier to grock.
The callback style doesn't have the return/cleanup problem.

In fact, since we aren't using objects as observers at all, we can sidestep the dispose issue. That can be left to higher-level abstractions.

I now think I prefer a callback-only API. Thoughts?

benlesh commented 9 years ago

With the current spec:

~99% of the time, you'll want the consumer to call dispose on your subscription/generator, because your consumer won't care about being notified that it's unsubscribing.
< 1% of the time (when you're actually using a generator as an observer), the consumer will want to call return() on the returned subscription/generator so it's observer/generator will clean up.

Maybe you had it right the first time?

If I had a struggle with this, it would be "why use a 'primed' generator-function as an observer at all?" it just seems like a mind-bending and unergonomic way to use Observable. That said, there might be some hidden and extremely valuable use case for this.

As an RxJS implementor, I could see a subscribeWithGenerator method coming out of this at the very least, that would, specifically, invoke return a generator instead of a subscription, and prime said observer/generator.

I'm starting to think it might be better to go with a design more similar to the original, where there is only a subscription object returned that has a dispose or unsubscribe method on it. And I think having the goal of using primed generator functions has observers, while interesting, is of little value.

I'm, of course, happy to be shown otherwise. These are weakly held opinions.

benjamingr commented 9 years ago

@zenparsing I tend to agree with your conclusions about the callback API, there is no way that a DOM event listener would look like:

el.events.click.subscribe(function*(){
    while(true){ 
        var e = (yield);
        console.log("Hi", e);
    }
});

Or even:

el.events.click.subscribe( {
    next (e) { console.log("Hi", e); }
});

When the API most people have been using since 8 years ago is:

el.click(e =>  console.log("Hi", e);

The biggest advantage of even using a generator here in the first place is to keep state between calls and I'm not even sure that's important (because closures) or good practice (because state in observables).

benjamingr commented 9 years ago

Although I'm waiting for @jhusain to weigh in here and make a strong case for generators again :)

benlesh commented 9 years ago

@benjamingr

el.events.click.subscribe( { next (e) { console.log("Hi", e); } });

It's probably more like:

el.clicks.forEach(e => console.log('Hi', e));

... if we're going to be fair about it.

FWIW: In RxJS, I'd have a "shorthand" override of subscribe that accepts 3 handlers instead of an object. It, unfortunately makes subscribe polymorphic for that library, but most of the internals use Symbol.observer anyhow, which I'll keep monomorphic.

benjamingr commented 9 years ago

@blesh the point I was making is that in that case a generator based API looks really weird.

(FWIW, I'm not arguing that observables have bad syntax, just that I'm not sure generator-iterator things as observers in most usual cases would be too common - I find RxJS quite ergonomic and with nice syntax in our code)

benlesh commented 9 years ago

the point I was making is that in that case a generator based API looks really weird.

I agree. I also don't think there are a large enough number of use-cases for using generators as observers to support muddying the behavior of the disposal process.

jhusain commented 9 years ago

Simply creating a generator with an endless while loop doesn't seem to be better than a simple callback.

observable.subscribe(function*() {
  while(token = yield) {
    // do something
  }
});

However these trivial examples don't showcase the true power of combining generators and push streams. The following code creates a push parser for a simple Lisp language. It showcases how we can use generators and yield* to efficiently transfer control from a parser function to smaller parser functions. This highlights the strength of generator functions over callbacks: generator functions allow you to control flow without building state machines. This code runs in FF today.

// abbreviated Observable polyfill
if (typeof Symbol.observer === "undefined") {
    Symbol.observer = Symbol("observer");
}

function Observable(observerFn) {
    this[Symbol.observer] = observerFn;
}

// create a push stream of the characters in a simple Lisp program
var simpleLispProgram = new Observable(function(observer) {
    var chars = "(add 5 3)".split(""),
        count,
        iterResult = {done: false};

    // cheating and ignoring IterationResult, can be fixed easily though
    chars.forEach(key => observer.next(key))

    // Happens to be sync, so we can just return observer
    // Would be more complicated if characters were
    // arriving async, but only the Observable defn would
    // change. Consumption code stays the same.
    return observer;
});

function prime(genFn) {
    return function() {
        var generator = genFn();
        generator.next();
        return generator;
    };
}

var char;

function* parseIdentifier() {
    var result = "";

    while(' )'.indexOf(char = yield) === -1) {
        result += char;
    }

    return result;
}

function* parseInteger() {
    var result = char;

    while(/[0-9]/.test(char = yield)) {
        result += char;
    }

    return parseInt(result,10);
}

function* parseExpression() {
    var identifier, args;
    char = yield;

    if (char === '"') {
        return yield* parseString();
    }
    else if (/[0-9]+/.test(char)) {
        return yield* parseInteger();
    }
    else if (char === '(') {
        identifier = yield* parseIdentifier();
        args = [];
        while(char !== ')') {
            args.push(yield* parseExpression());
        }

        return {ident: identifier, args: args};
    }
};

simpleLispProgram[Symbol.observer](prime(function*() {
    var ast = yield* parseExpression();
    console.log(JSON.stringify(ast, null, 4));
})());

Building the same program with callbacks would either necessitate expensive composition, or a monolithic state machine. It would be very hard to decompose the parseExpression function into smaller functions. I still believe that observer callbacks are a special case of the generator interface. If we throw out generators we are giving up a lot of expressiveness that's easily available to use for free thanks to ES6 generators.

Of course we can always convert from generators to callbacks. However this is lossy. Consider that the IterationResults allows Observables to be fully synchronous while also allowing sinks to unsubscribe. This wouldn't be possible with the Rx-style onNext, onError, onCompleted callbacks.

There are very important semantics we lose if we give up generators for callbacks. If we provide an API that converts callbacks into generators we maintain maximum semantic expressiveness.

domenic commented 9 years ago

expensive composition

Can you please benchmark? Generators are generally going to be slower, given the contortions they force JITs into.

Of course we can always convert from generators to callbacks. However this is lossy. Consider that the IterationResults allows Observables to be fully synchronous while also allowing sinks to unsubscribe. This wouldn't be possible with the Rx-style onNext, onError, onCompleted callbacks.

Can you expand? E.g. show a program that cannot be done with the Rx-style callbacks? I do not understand what is lost here, especially if you're talking about ES 2015 generators instead of a hypothetical future evolution of them.

benjamingr commented 9 years ago

@jhusain your LISP parsing example is nice but I don't think it showcases an advantage for generators at all. Like you said, in the vast majority of simple subscription examples a callback is less cruft.

In the LISP parsing example - what would be lost in it by doing:

let toCallback = iter => e => generator.next(e);  // can prime here too if we want
}

Which would let me subscribe to the token observable via:

tokens.subscribe(toCallback(function*(){
    // same generator logic as we had in your example
}()); // we might also want to prime here if we didn't in `toCallback`

Can you show an example using throw and return that would showcase something that isn't achievable with similar syntax with callbacks?

zenparsing commented 9 years ago

@jhusain A callback API can be adapted to accept a generator using a userland subscribe-wrapping function, as the subscribeGenerator function above shows. (This also includes unsubscribing when the generator return { done: true }, and performing shutdown via generator.return on unsubscription.) As a bonus, such a function can also do auto-priming.

The callback API can be adapted to any push interface you like, so in a sense it's more low-level. It's also generally a bit more ergonomic in the common case.

jhusain commented 9 years ago

@domenic it's true that generators are not currently optimized. However there are certain types of operations which could theoretically be implemented more efficiently using yield*. Concatenation of Observables is one way of transferring control flow from one data source to another. However concatenation of observables does not have a linear cost, because each concatenation produces an extra method call per value.

Conversely a compiler can fuse together two streams more efficiently using yield*. I fully admit that today generators are slow, as are native Promises and some other newer features, but that doesn't mean that they can't get faster in the future. That said, I agree that if someone wants to use a generator by piping data to it from a call back API they can certainly do that.

Given that we can convert from generators to callbacks and vice versa, the only thing that I see that would be lost by giving up generators is the synchronous unsubscription. This could be accomplished by allowing any of the callbacks to return a Boolean indicating whether they want to continue receiving values. If there was some facility for synchronous unsubscription, I would be happy to drop generators from the contract and sidestep the issue of a dispose method on generators.

benjamingr commented 9 years ago

If there was some facility for synchronous unsubscription, I would be happy to drop generators from the contract and sidestep the issue of a dispose method on generators.

I think we have reached consensus. Awesome.

What did you have in mind for synchronous unsubscription?

jhusain commented 9 years ago

Returning true would be ergonomic because it could be easily ignored. It would also match done:true and handled = true.

zenparsing commented 9 years ago

@jhusain That's seems pretty good to me. We already have a less flexible way to synchronously unsubscribe from the next callback: by calling subscription.unsubscribe. Of course, we have to capture a reference to the subscription for that to work.

let subscription = obs.subscribe(value => {
    console.log(`Got value ${ value }`);
    subscription.unsubscribe();
});

Compared to:

obs.subscribe(value => {
    console.log(`Got value ${ value }`);
    return true;
});

Do you think that the "returning true" feature pulls its weight given the ability to call subscription.unsubscribe?

benlesh commented 9 years ago

At first I thought this was really cool, but the more I think about it, it just seems to me like using a generator to subscribe to an observable is an edge case. I would expect extremely low usage of this feature, and I don't think it's worth muddying the disposal behavior to support it.

If the Observable behavior was back to what it was originally doing (the more RxJS2-like approach), writing a method to subscribe with a generator and return a generator would be easy.

jhusain commented 9 years ago

There's a race condition there. You can call subscription.unsubscribe if the data source firehoses data at you. This will happen when Symbol.observer is invoked. You need a synchronous unsubscription mechanism.

JH

On Jul 5, 2015, at 7:23 PM, zenparsing notifications@github.com wrote:

@jhusain That's seems pretty good to me. We already have a less flexible way to synchronously unsubscribe from the next callback: by calling subscription.unsubscribe. Of course, we have to capture a reference to the subscription for that to work.

let subscription = obs.subscribe(value => { console.log(Got value ${ value }); subscription.unsubscribe(); }); Compared to:

obs.subscribe(value => { console.log(Got value ${ value }); return true; }); Do you think that the "returning true" feature pulls its weight given the ability to call subscription.unsubscribe?

— Reply to this email directly or view it on GitHub.

zenparsing commented 9 years ago

@jhusain Not sure I understand?

When we call subscription.unsubscribe, it will synchronously release the reference to the observer callbacks and prevent any more notifications from being sent. Perhaps you're referring to the difference on the producer side?

With a return value, the producer could look at that and decide to break out of a loop, for instance. I think you could do the same sort of thing with unsubscribe though.

With a return value from next:

let obs = new Observable((next) => {
    for (let i = 0; i < 10000; ++i) {
        let returnValue = next(i);
        if (returnValue) break;
    }
});

Using unsubscription:

let obs = new Observable((next) => {
    let stop = false;
    for (let i = 0; i < 10000; ++i) {
        next(i);
        if (stop) break;
    }
    return _=> { stop = true };
});

zenparsing commented 9 years ago

The second example above doesn't necessarily work because the observer doesn't have a reference to the subscription until after subscribe completes. If the subscriber function pushes values synchronously the observer won't be able to stop it.

Thanks @jhusain for clarifying my thinking on that.

benjamingr commented 9 years ago

@zenparsing I think this can be closed now that consensus has been reached, I think no one objects to synchronous unsubscription (in fact, I'm not even sure why unsubscription would ever be asynchronous anyway).

zenparsing commented 9 years ago

I'm not 100% convinced that returning a truthy value to indicate unsubscription is the best way to go, but I don't have an objection at the moment, so...

benjamingr commented 9 years ago

That could be a separate issue :)

On Wed, Jul 8, 2015 at 6:11 PM, zenparsing notifications@github.com wrote:

I'm not 100% convinced that returning a truthy value to indicate unsubscription is the best way to go, but I don't have an objection at the moment, so...

— Reply to this email directly or view it on GitHub https://github.com/zenparsing/es-observable/issues/35#issuecomment-119619055 .

tc39 / proposal-observable

Generators are not observers #35