kefirjs / kefir

A Reactive Programming library for JavaScript
https://kefirjs.github.io/kefir/
MIT License

Should transformations applied to properties return streams? #147

Closed rpominov closed 8 years ago

rpominov commented 9 years ago

I think this topic needs a separate issue (was originally started in #140 and continued in #142).

/cc @cefn

cefn commented 9 years ago

I am comfortable learning from the corner cases (as documented in the threads you mention) that Property is not the simple cache I thought it was, and I don't want to argue for Property being something it's not.

I acknowledge we shouldn't overlook the convenience of smart defaults (the suggestion that if Propertyness is always wanted, then Property contagion has no cost and transformations should somehow be expected to maintain the Property contract).

However, the effort required to deal with these corner cases for all transformations does suggest it would be simpler and more predictable to avoid this duality in every operation, assuming there's no cost to the expressivity available in Kefir.

My understanding is that you can always resume Property behaviour explicitly using toProperty() on an active stream but the opposite is not true.

To complement this simplification (all transformations turned into streams), an alternative primitive Observable transformation could be provided, perhaps cache() or subscriberCache(), corresponding with a stream behaviour of saving the last-served value, and re-serving it to each future subscriber.
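The proposed cache() behaviour can be sketched in plain JavaScript, independently of Kefir. This is only an illustration under stated assumptions: `createStream` and `cache` are hypothetical names invented here, the model is synchronous, the cache subscribes to its source eagerly (unlike Kefir's lazy activation), and errors, end events and unsubscription are ignored.

```javascript
// Hypothetical minimal push stream: no current value, no memory.
function createStream() {
  const subscribers = [];
  return {
    subscribe(fn) { subscribers.push(fn); },
    emit(value) { subscribers.slice().forEach((fn) => fn(value)); },
  };
}

// Hypothetical cache(): remembers the last value served and replays it
// to every subscriber that arrives later.
function cache(source) {
  const out = createStream();
  let hasValue = false;
  let last;
  source.subscribe((value) => {
    hasValue = true;
    last = value;
    out.emit(value);
  });
  return {
    subscribe(fn) {
      if (hasValue) fn(last); // re-serve the cached value first
      out.subscribe(fn);
    },
  };
}

const source = createStream();
const cached = cache(source);
const once = [];
const twice = [];
cached.subscribe((v) => once.push(v));
[1, 2, 3].forEach((v) => source.emit(v));
cached.subscribe((v) => twice.push(v)); // late subscriber
// once is now [1, 2, 3]; twice is [3]
```

The late subscriber receives only the cached value, and both subscribers receive anything emitted afterwards.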

This is what I seem to always need and I presume would be much much simpler to implement and easier to predict. Because Property seems to have a semantic rather than an operational definition, it leads to potential ambiguities depending how the author might interpret its semantic role in corner cases (which come up a lot in transformations).

For example, my instinct is that any loss of information (values being ignored as in https://github.com/rpominov/kefir/issues/142#issuecomment-138069622 ) could, would and should be explicit, e.g. through a consumeSimultaneous() or momentary() invocation, determining that every value emitted in a single stack call would be thrown away, except the very last one. This may certainly be employed to eliminate efficiency issues e.g. in UI implementation - avoiding repaints from flushed buffers containing multiple values.
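One possible reading of momentary(), sketched in plain JavaScript rather than against Kefir: values arriving in the same synchronous burst are coalesced and only the last is delivered. The name `momentary`, its `subscribeToSource` argument and the injectable `schedule` parameter are assumptions made for this illustration; the default scheduler is the standard `queueMicrotask`.

```javascript
// Hypothetical momentary(): values arriving in the same synchronous turn are
// coalesced, and only the last one is delivered once the scheduled flush runs.
function momentary(subscribeToSource, schedule = queueMicrotask) {
  const subscribers = [];
  let pending = false;
  let latest;
  subscribeToSource((value) => {
    latest = value; // overwrite: earlier values in this burst are dropped
    if (pending) return;
    pending = true;
    schedule(() => {
      pending = false;
      subscribers.slice().forEach((fn) => fn(latest));
    });
  });
  return { subscribe(fn) { subscribers.push(fn); } };
}

// Usage with a manual scheduler, so the flush point is explicit.
const tasks = [];
const flush = () => { while (tasks.length) tasks.shift()(); };
let push;
const stream = momentary((fn) => { push = fn; }, (task) => tasks.push(task));

const seen = [];
stream.subscribe((v) => seen.push(v));
[1, 2, 3].forEach((v) => push(v)); // three values in one synchronous burst
flush();
// seen is now [3]
```

Making the scheduler a parameter keeps the information loss explicit: the caller decides what counts as "simultaneous".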

However, I'd be curious to know the circumstances in which the combination of explicit cache() and momentary() wouldn't deliver, more explicitly, what Property does, with a large reduction in complexity and possibly a gain in efficiency (e.g. all intermediate transformations can be plain streams). If there are such circumstances, I'd also like to know whether any of the divergences between toProperty() (with contagion) and momentary().cache() (with no contagion) are actually desired.

I believe the combination of dropping Property contagion and making cache() and momentary() available to make these features of Property explicit would simplify the implementation of Kefir and avoid unusual corner cases and unexpected semantic interpretations of what is 'Property' as discussed throughout the referenced threads.

cefn commented 9 years ago

For reference the alternative strategy based on momentary() and cache() is illustrated by the gist at... https://gist.github.com/cefn/f73380591630171f0e83

Where the invocation...

    var cached = cacheStream(onceStreamFactory());
    cached.log("once");
    cached.log("twice");

...produces the following logging...

    once <value:current> 1
    once <value:current> 2
    once <value:current> 3
    twice <value:current> 3

While the invocation...

    var momentary = momentaryStream(onceStreamFactory());
    momentary.log("once");
    momentary.log("twice");

...produces the logging...

    once <value> 3
    twice <value> 3

_Update 1_: The momentary() transformation might amount to the same as debounce(0) (for streams at least).

_Update 2_: I speculate that the current Property contagion behaviour in Kefir is equivalent to calling .momentary().cache() after every transformation (and perhaps the preferred behaviour would be momentary().cache().onValue(_.noop) to additionally force all Properties to be active, according to some contributors in https://github.com/rpominov/kefir/issues/43 ).

_Update 3_: I wonder whether transformation contagion (like the first-subscriber and every-subscriber triggers) might be worth considering as a first-class behaviour. That is to say, a stream could subscribe to intercept the creation of all future downstream sinks and transform them before they are returned to the caller. This would allow Property contagion to be opt-in, but also permit a whole load of other opinionated behaviours about stream 'nature' to co-exist through a simple callback signature.

aindlq commented 9 years ago

My two cents: I don't think this is a good idea. Streams and Properties are two different beasts. They have different semantics, intentionally, and I think this duality is unavoidable.

From the original definition (see http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.23.360&rep=rep1&type=pdf): "There are two key polymorphic data types in FRP: the Behavior and the Event. A value of type Behavior a is a value of type a that varies over continuous time." and "A value of type Event a is a time-ordered sequence of event occurrences, each carrying a value of type a." I would recommend reading this paper, at least the beginning and the examples.

Some parallels with the real world: imagine you have a digital thermometer. The values it produces are a Stream, a stream of measurements. But what you see on the display is a Property, the current measurement. It is possible to get from a Stream to a Property, but it is impossible to convert a Property back to the original Stream. So a Property is not a Stream per se.

I'm using both Streams and Properties extensively, and am very happy with the semantics they have. That was actually the reason I chose bacon and then kefir rather than something like RxJs. In fact, all the corner cases I have run into were related to this implicit transformation between Properties and Streams, e.g. scan on a Stream returns a Property.

cefn commented 9 years ago

Hey, thanks, @aindlq . That's a really interesting reference and it's focused a perspective on another, orthogonal, intention embedded in Properties.

I don't know that Behaviour in the paper maps to the core aspect of Property in Kefir we're discussing, (as covered at https://rpominov.github.io/kefir/#about-observables and https://rpominov.github.io/kefir/#current-in-streams ). In Kefir, the reliable feature of Property seems to be about the existence of a defined current value, based on the last served value, which I've referred to as a cache, which is served to new subscribers. So if anything this aspect of Property takes it closer to the definition of Event in the paper, with discontinuous changes to values, (but with a cache for new subscribers).

However, Behaviour in the paper (a continuously defined function which can be collapsed by evaluation to a value through sampling/binding its variables - on a pull model) does relate a little bit to an optional feature of Kefir Properties - the getCurrent callback ( https://rpominov.github.io/kefir/#to-property ). This 'pull model' only really affects the first value served by a Property on activation (where activation is the 'pull' sampling action as described in the paper). This functional evaluation is then overridden by future cached-served-values, so getCurrent soon gives way once again to the evented 'push' model where Property behaves like a stream of discontinuous value-change events (with a cached current value), and in which the functional definition in getCurrent is ignored (assuming I now have toProperty() correct in my mind).
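The interplay described above, a 'pull' on activation followed by 'push' updates that override it, can be sketched in plain JavaScript. This is not Kefir's implementation: `toPropertySketch` and its `subscribeToSource` argument are hypothetical, the sketch assumes getCurrent is consulted once when the first subscriber arrives, and deactivation is not modelled.

```javascript
// Hypothetical sketch: sample getCurrent on activation, then let pushed
// values replace the sampled one.
function toPropertySketch(subscribeToSource, getCurrent) {
  const subscribers = [];
  let active = false;
  let hasCurrent = false;
  let current;

  function activate() {
    active = true;
    if (getCurrent) { // "pull": sample once on activation
      current = getCurrent();
      hasCurrent = true;
    }
    subscribeToSource((value) => { // "push": later values override the sample
      current = value;
      hasCurrent = true;
      subscribers.slice().forEach((fn) => fn(value));
    });
  }

  return {
    subscribe(fn) {
      if (!active) activate();
      if (hasCurrent) fn(current); // serve the current value to the newcomer
      subscribers.push(fn);
    },
  };
}

let push;
let sensor = 20;
const temperature = toPropertySketch((fn) => { push = fn; }, () => sensor);

const first = [];
temperature.subscribe((v) => first.push(v)); // activation pulls 20
push(21);
sensor = 99; // not consulted again in this sketch
const second = [];
temperature.subscribe((v) => second.push(v));
// first is [20, 21]; second is [21]
```

The second subscriber sees the last pushed value, not a fresh sample, which is the point at which the functional definition in getCurrent stops mattering.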

The fact that both of these quite distinct aspects are both considered to be 'Property' nature does give me a slight worry. Things which are so distinct seem like they should be separate and I'd also be more confident if each were explicitly identified rather than implied in the semantic intent of Property.

I guess I was hoping the trick of everything being a stream would help to make clear each of the different ways in which streams could be augmented with atomic but distinct and operationally-defined features, making the results more comprehensible/predictable.

For example from this discussion it's emerged that Property has not one, but two or more key features, possibly both aspiring to some semantic intent, but which could be explicitly separated in API terms with different names and expressed in stream terms...

I have to admit the whole endeavour is probably flawed if ultra-local Stream operations are not rich enough to express everything needed, but so far they appear to be rich enough as far as I can see. Additionally AFAIK the features which co-exist in Property, (cache-oriented, getCurrent-oriented and more richly semantically-committed like the flatten behaviour characterised in https://github.com/rpominov/kefir/issues/144 ) probably don't need to be collapsed into a single concept and my suspicion is there is a real cost in doing so.

Relying on Property semantics (there being an interpretation of the purpose of Property) seems to lead to ambiguity, and unavoidably arbitrary definitions which are impossible to get right, especially when doing transformations.

By contrast a separate specification of the different behavioural aspects of Property - described by implementation and ultra-local to Streams without 'Propertyness contagion' throughout a pipeline - could eliminate many of these concerns.

The puzzle which remains for me is to identify any aspects of Property which can't be expressed in Stream terms (although we certainly need the additional expressivity of triggers on first-subscription and every-subscription, possibly other key triggers?).

Related to this is the question of whether Property is a single concept, or actually many concepts snuck into a single implementation. Having multiple aspects in one leads to surprising consequences for people who are focused on one feature (getCurrent) but also get others in the same package (cache and other semantics).

When you say Property currently serves your needs, @aindlq, do you consider Property nature to be about caching? getCurrent? Semantic commitments like https://github.com/rpominov/kefir/issues/144 ? ...something else?

Do you feel they actually map to the paper's description of Behaviour in their current form? If so, I need to revisit a lot of things to get my ideas straight, as I really don't see this.

@aindlq in your use of toProperty() would cache() or momentary() or both (see https://github.com/rpominov/kefir/issues/147#issuecomment-138285060 ) be equivalent for your needs?

cefn commented 9 years ago

I've been experimenting with the performance and compositional consequences of an alternative stream library approach. It eliminates Properties as first-class elements. I put together a draft of a library which takes these and other minimal design strategies to their logical conclusion.

The test case outlined below suggests a large speedup from following the minimal approach when it comes to simple transform sequences. It constructs a series of 1400 map (+1) operations fed with a single value of 0, timing the promise of the event which pops out the stream at the end.

The results shown are the second run of the test in Node (allowing V8 to fully load all the code, which otherwise skews the results).

Event generated 1400
AS took 3 ms
Event generated 1400
Kefir took 15 ms

Initialising the map sequence has a similar payoff, with Kefir taking 5ms to create the sequence of 1400, and Angstream taking just 1ms within the V8 debugger.

When I run it without the V8 debugger enabled, presumably allowing JIT compilation, the difference is even bigger!

Event generated 1000
AS took 3 ms
Event generated 1000
Kefir took 26 ms

I've been codenaming the library Angstream as it's intended to be as tiny as possible in resource commitments. This is as 'breaking' a suggestion as there can be, so it made sense to explore it as a substitute library implementation. It may provide some ideas which could be reused somewhere in Kefir.

It includes no notion of Property, but it attempts to recreate the equivalent functionality through 'onsubscribe caching'. All operations modify the stream's behaviour directly, using an internal transformation stack based on the canonical signature of Kefir#withHandler(). If you wish to add a transform without changing the behaviour of the original stream, you first branch() the stream and then the original remains intact as you perform further modifications.
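As a rough illustration of the in-place modification and branch() idea described above (not the actual Angstream code, which lives in the linked gist), a plain JavaScript sketch might look like this. `PipeSketch` and its methods are hypothetical names, and only synchronous map stages are supported.

```javascript
// Hypothetical stream that modifies itself when a transform is added.
class PipeSketch {
  constructor() {
    this.stages = []; // transformation stack, applied in order
    this.subscribers = [];
  }
  map(fn) {
    this.stages.push(fn); // changes this stream; no new stream is created
    return this;
  }
  branch() {
    const child = new PipeSketch();
    this.subscribe((value) => child.push(value)); // child is fed by this stream's output
    return child;
  }
  subscribe(fn) {
    this.subscribers.push(fn);
  }
  push(value) {
    let result = value;
    for (const stage of this.stages) result = stage(result); // flat loop, no nesting
    this.subscribers.slice().forEach((fn) => fn(result));
  }
}

const original = new PipeSketch().map((v) => v + 1);
const branched = original.branch().map((v) => v * 10); // original is unaffected
const fromOriginal = [];
const fromBranch = [];
original.subscribe((v) => fromOriginal.push(v));
branched.subscribe((v) => fromBranch.push(v));
original.push(1);
// fromOriginal is [2]; fromBranch is [20]
```

In this sketch a transform added to the original after branching would still change what the branch receives, since the branch is fed by the original's output.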

The reason for the choice of 1400 is that if you create a map sequence of 1500+ then Kefir exceeds the stack allocation of the Node VM, (the Angstream test case can go to a sequence of 5000 before exceeding the available stack frames).

I believe there are a few more advantages too. I have found that debugging stream logic may be easier in the case of Angstream as the stack is more easily inspected; you can literally see the stream transformation sequence in e.g. the V8 stack debugger in Webstorm with just a few comprehensible stack frames in between.

To get the same performance and debugging benefits a middle way exists, where the equivalent of Angstream's 'Pipe' can offer up a withHandler function to Kefir containing a sequence of withHandler functions as middleware. This would make it possible to employ the same strategy as Angstream within Kefir without changing the API. It seems from experiments so far that a library of middleware functions exposed in a withHandler format could be provided with equivalent expressivity to the available Kefir stream transformations.
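The 'middle way' could be sketched as one withHandler-shaped function that runs a list of middleware handlers in a loop. The assumptions here are loud ones: the handler shape `(emitter, event)` with `event.type` of 'value', 'error' or 'end' and an emitter exposing `emit`, `error` and `end` is modelled on Kefir's withHandler, `mapHandler` and `composeHandlers` are hypothetical names, every middleware is assumed to emit synchronously, and no performance claim is made for this version.

```javascript
// Hypothetical middleware in a withHandler-like shape: (emitter, event) => void.
function mapHandler(fn) {
  return (emitter, event) => {
    if (event.type === "value") emitter.emit(fn(event.value));
    else if (event.type === "error") emitter.error(event.value);
    else emitter.end();
  };
}

// Runs the handlers stage by stage in a flat loop instead of nested calls.
// Events emitted by one stage are buffered and fed to the next stage.
function composeHandlers(handlers) {
  return function composed(emitter, event) {
    let pending = [event];
    for (const handler of handlers) {
      const next = [];
      const collector = {
        emit: (value) => next.push({ type: "value", value }),
        error: (value) => next.push({ type: "error", value }),
        end: () => next.push({ type: "end" }),
      };
      for (const ev of pending) handler(collector, ev);
      pending = next;
      if (pending.length === 0) return; // a stage swallowed everything
    }
    for (const ev of pending) {
      if (ev.type === "value") emitter.emit(ev.value);
      else if (ev.type === "error") emitter.error(ev.value);
      else emitter.end();
    }
  };
}

// Usage: 1400 map(+1) stages fed with a single 0, as in the test case.
const handlers = [];
for (let i = 0; i < 1400; i++) handlers.push(mapHandler((v) => v + 1));
const handler = composeHandlers(handlers);

const results = [];
const sink = {
  emit: (v) => results.push(v),
  error: () => {},
  end: () => results.push("end"),
};
handler(sink, { type: "value", value: 0 });
// results is now [1400]
```

Because the stages run in a loop, the call stack stays shallow regardless of how many handlers are in the list.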

The test case looks like this...

    it("Compare Kefir and AS", function() {

        function Case(){
            this.target = 1400;
            var count;
            for(count=0;count<this.target;count++){
                this.mapStream(function(value){
                    return value + 1;
                });
            }
        }
        Case.prototype = _.create(Object.prototype, {
            promiseDelay:function(){
                var start = Date.now();
                var promise = this.promiseEvent()
                    .then(function(value){
                        console.log("Event generated " + value);
                    })
                    .then(function(){
                        return Date.now() - start;
                    });
                this.emitter.emit(0);
                return promise;
            },
        });

        function KefirCase(){
            var that = this;
            this.stream = Kefir.stream(function(emitter){
                that.emitter = emitter;
            });
            Case.apply(this, arguments);
        }
        KefirCase.prototype = _.create(Case.prototype, {
            mapStream:function(fn){
                this.stream = this.stream.map(fn);
            },
            promiseEvent:function(){
                return this.stream.take(1).toPromise();
            }
        });

        function AsCase(){
            this.stream = new AS.Stream();
            this.emitter = this.stream.in;
            Case.apply(this, arguments);
        }
        AsCase.prototype = _.create(Case.prototype,{
            mapStream:function(fn) {
                this.stream.map(fn);
            },
            promiseEvent:function() {
                var limited = this.stream.branch();
                limited.take(1);
                return AS.promiseLast(limited);
            }
        });

        var kefirExample = new KefirCase();
        var asExample = new AsCase();

        return Promise.resolve()
            .then(function(){
                return asExample.promiseDelay();
            })
            .then(function(value){
                console.log("AS took " + value + " ms");
            })
            .then(function(){
                return kefirExample.promiseDelay();
            })
            .then(function(value){
                console.log("Kefir took " + value + " ms");
            });
    });

You can see a rough draft of the Angstream library at this gist... https://gist.github.com/cefn/99c69c0ab44b091a9ea6

...and some examples of invocations which behave more-or-less as they should at... https://gist.github.com/cefn/a0e12f88ebd7917c4125

It's very much "work in progress" and has plenty of monkey-patches and untested operations. I've got some basic scenarios logging sensible results from filter(), map(), flatten(), sample(), interval(), concat(), take() and promiseLast().

I won't have a chance to harden it beyond the few hours I've spent knocking it together to sanity-check the approach. A lot of the work has been verifying the 'middle way' of withHandler()-compatible middleware which may be needed to hit performance demands within the application I'm working on, but that will be embedded within Kefir streams so the draft Angstream Stream and Emitter implementations won't play a part and will have to be revisited in the future if I ever get a chance.

rpominov commented 8 years ago

Probably won't happen. Closing for now to clean up open issues.