kefirjs / kefir

A Reactive Programming library for JavaScript
https://kefirjs.github.io/kefir/
MIT License
1.87k stars 97 forks source link

Making all observables active by default? #43

Closed rpominov closed 8 years ago

rpominov commented 9 years ago

Kefir has feature that allows lazy subscription to observable sources http://pozadi.github.io/kefir/#active-state Other libraries also has this feature, Bacon works same way, and in RxJS one is able to turn it on.

This feature is pretty great, it allows to achieve good performance in case of creating a lot of observables for later use. For example one can safely create a mouseMoves stream, then attach to it a .map(heavyWork), but until result stream has at least one subscriber the heavyWork function won't be called (we won't even subscribe to mousemove DOM event).

Although this feature creates some annoying issues with stateful observables. Here is some links describing the problem:

One of possible solutions is to make all observables active by default (maybe with ability to optionally turn on lazy subscribing). This of course a big change to how library works, and for now it feels like lazy subscribing is basically a good thing, and we shouldn't disable it. But maybe disabling it is actually the way to go.

I created this issue to discuss all possible pros and cons, and to collect all information on the topic here.

jensklose commented 9 years ago

Another possible solution is to define new active state observables. This would not break the existing code base. Something like: Kefir.active() Kefir.activeStream()

mhelvens commented 9 years ago

I think that lazy subscription is fundamental to a good FRP library, especially one such as Kefir, which is specifically designed to be fast. So I don't think you should make 'active' the default state of any kind of observable.

Ideally, you want to have the notion of active/inactive/lazy encapsulated away. Observables should always act as though they are active, and be internally lazy whenever possible.

There's really only a potential problem with laziness and Propertys, not Streams. But the solution is not to make them non-lazy. In cases where a 'getter' is available to synchronously fetch the current value of the property (like in the Bacon.js example of window.location.hash), it's simply a matter of checking whether the observable is active and, if not, quickly update the current value before giving control back to the caller.

If the only available 'getter' is asynchronous, the problem is fundamentally different, and the solution deserves a fundamentally different interface, like one based on promises.

rpominov commented 9 years ago

I think that lazy subscription is fundamental to a good FRP library, especially one such as Kefir, which is specifically designed to be fast. So I don't think you should make 'active' the default state of any kind of observable.

I agree, but still worth consideration. I hope we end up with better solution (or separate solutions for separate aspects of the issue). There is already some interesting ideas in Bacon.js thread.

Ideally, you want to have the notion of active/inactive/lazy encapsulated away. Observables should always act as though they are active, and be internally lazy whenever possible.

That would be awesome, not easy to do though.

There's really only a potential problem with laziness and Property

In fact, no. Let's look at .take for example. It takes n events, but not counts events in hibernation period. In other words, what events is emitted depends on when it gets/loses subscribers. I'd say this is the definition of the issue: "In some observables emitted events depend on when the observable becomes active/inactive". It might cover all characteristics of events: time when they being emitted, content (the value or error itself), even amount of how many events are emitted.

Furthermore, if we use that definition, it also applies to all setInterval based streams as we call setInterval at activation and clearInterval at deactivation.

Having all this in mind, making idea of removing laziness not such a bad idea :) But it still pretty bad, I agree.

Also after describing all this, I'd want to remind that for now it can be solved with easy workaround — by adding a dummy subscriber to force the observable to be active.

rpominov commented 9 years ago

Here is the list of all currently affected observables (of methods that produces them):

Maybe there is more. I've included only those that certainly are affected and "in a bad way" :)

mhelvens commented 9 years ago

There's really only a potential problem with laziness and Property

In fact, no. Let's look at .take for example. It takes n events, but not counts events in hibernation period.

You're right! I wasn't thinking very clearly there. Anyway, what I said about the 'ideal' solution still holds: Observables should always act as though they are active, and be internally lazy whenever possible.

In the case of .skip, for example, the subscription would be active at least until the amount of events to be skipped has past. After that it would be lazy based on user subscription. Same for .take, except that stream ends immediately afterwards, so it is active during its whole lifetime.

To put it another way: Observables are always lazy, but the internals of Kefir can sometimes take their own subscription in order to ensure the correct behavior. And I suggest you actually implement it that way.

rpominov commented 9 years ago

I think if we make everything lazy only whenever possible, it will be really rare case when an observable will be inactive in whole system. So it not very different from simply make everything active by default, which, btw, is much easier — implementing that smart laziness is a lot of work. So if we decide to go this way, i'd better go with all active by default.

But have to say, I'am not happy with making anything active until user explicitly make it so. It'll break automatic resource management, which is a one of core ideas behind Bacon/Kefir/Rx family. Now if you have a variable pointing to an observable, you can safely do a = null and, if you unsubscribed from it, be sure that all observable resources will be garbage collected (unless some other module using that observable of course).

We may decide to make all (or some) observables active by default (or at some periods) only if we come to conclusion that we not lose too much by doing so, for now it's not the case I am afraid.

mhelvens commented 9 years ago

I haven't thought about this very deeply yet. When I have time, I'll have a look at your list there, to see how I might handle them, because it's an interesting problem. I suspect many stream-types could still stay inactive, or 'partially inactive', much of the time. (With 'partially inactive' I mean that it does not need to subscribe to its source stream, but may have, for example, a clock running.)

Generally speaking, though, I feel that correct behavior trumps 'convenient resource management'. And you've convinced me that, strictly speaking, many of those methods do not currently exhibit correct behavior. For resource management, you might make it a policy to call a.destruct() (or whatever) before doing a = null, if it comes to that.

rpominov commented 9 years ago

One can't call a.destruct() because he doesn't know if other modules use that observable or not :) The current system handles resource management pretty nice.

mhelvens commented 9 years ago

One can't call a.destruct() because he doesn't know if other modules use that observable or not

sigh True. The downside of tracing garbage collectors. For truly proper resource management, you need a reference counting system. But implementing one for Kefir seems overkill.

Anyway, .destruct() would of course start by properly ending the observable, and it should be called only by its 'owner', of which it is assumed there is only one.

The current system handles resource management pretty nice.

Yes. But it has the downside of being functionally incorrect. :-) Like I said, in my book correctness trumps convenient resource management. Though the user should probably be able to override things

jamesknelson commented 9 years ago

Just to add to the discussion, I'm using a custom "Model" component to solve one instance of this problem. It doesn't solve all the issues (e.g. take still won't take unless the stream is active), but it does allow for properties whose values can be updated even while not active.

I'd say the primary situation where this applies is you want to listen to changes on some data you have control of. The naive way to do this would be:

var bus = Kefir.bus();
var prop = bus.toProperty(1);
bus.emit(2);
prop.onValue(function(x) { console.log(x); });

1 is logged, but we want 2!

My solution:

function Model(x, is) {
  Property.call(this);
  this._current = x;
  this._is = is || Object.is;
}

inherit(Model, Property, {
  _name: 'model',
  get: function() {
    return this._current;
  },
  set: function(x) {
    if (!this._is(this._current, x)) {
      this._send(VALUE, x);
    }
  },
  update: function(fn) {
    this.set(fn(this.get()));
  },
  dispose: function() {
    this._send(END);
  }
});

Kefir.model = function(x, is) {
  return new Model(x, is);
}

Example usage:

var model = Kefir.model(1);
model.set(2);
prop.onValue(function(x) { console.log(x); });

2 is logged, as expected.

rpominov commented 9 years ago

I like the Model abstraction, it's basically emitter and property squashed together, so there is no problem of property losing values from emitter. It also introduces nice opportunities, such as possibility of implementing .update and lenses. Going to add it to Kefir in near future.

rpominov commented 9 years ago

Another thing worth mentioning here, the incorrect behavior of setInterval based streams actually pretty useful sometimes.

Consider you want to add one value at the end of some stream but with certain timeout between end and that value. Now it's easy: stream.concat(Kefir.later(1000, 1)). It works because .later will start timer when it gets first subscriber, which happen only when stream end, thanks to .concat. But if .later will always behave correctly, and start timer at moment on which it was created, line above will work deferent: depending on time when stream ends, result stream will not have 1 at all, or have it with deferent from 1000ms timeout.

mhelvens commented 9 years ago

I think it would be tricky to formally specify such behavior, and hard for people to understand.

There are more explicit ways to do this. I gave one in #51. You might simplify this for the user by accepting a function as an argument in .concat:

stream.concat(() => Kefir.later(1000, 1))
shamansir commented 9 years ago

I really like the stream.concat(Kefir.later(1000, 1)) way and in a truly a lot of cases in my code I get benefits from lazy streams—I prepare them to use them later and then I may start them with just one command. May be I am not getting the thread correctly, but for statistics, I got really used to this technique and I find it quite straight, so at least I wouldn't expect any combination of streams to be active by default in Kefir.

appsforartists commented 9 years ago

I'm a new user of Kefir (but an experienced JavaScript developer), and the laziness of properties confuses me. I understand why you would be able to throw away values from a stream ("If a tree falls and no one hears it…"), but the whole point of a property is to cache its last value - otherwise, it's a stream.

Consider this code:

function makeProperty () {
  var emitter = Kefir.emitter();
  var result = emitter.skipDuplicates().toProperty();

  emitter.emit(initialValue);
  return result;
}

var property = makeProperty();

var combinator = Kefir.combine(
  [
    property,
    //...
  ],

  (propertyValue, /* ... */) => doSomethingWith(propertyValue)
);

The combinator will never be called because property hasn't triggered yet (even though there's an explicit emit call in the constructor).

Requiring an onValue attachment in makeProperty for combinator to be populated is really unintuitive and feels broken.

appsforartists commented 9 years ago

In fact, the above code wouldn't trigger even if the properties were active because combine only cares about things that trigger after it's been declared. That also feels broken.

rpominov commented 9 years ago

combine only cares about things that trigger after it's been declared

This is not completely true. When you combine properties, combine handles current values from them.

var emitter = Kefir.emitter();
var property = emitter.toProperty();
property.onValue(function() {}); // activating the property
emitter.emit(1);

Kefir.combine([property, Kefir.constant(2)], function(a, b) {return a + b;})
  .onValue(function(x) {console.log(x)}); // => 3
Macil commented 9 years ago

Sometimes I find that I want a stream to start off inactive, but once it becomes active once, I'm fine with it staying active after that. That seems to work around a lot of weirdness with properties, though I haven't put too much thought about whether I'd always want that with all properties or even streams in general.

cbaatz commented 9 years ago

The trade-off here seems to be between performance-by-default and ease-of-use. Performance-by-default is what Kefir does now. Ease-of-use would mean exposing semantics that are simple for developers to use and I think that probably means @mhelvens lazy only when stream is not stateful.

Acquiring a correct mental model of Kefir's current behaviour isn't hard, but using it is because of the need to mentally model the activation states of streams/properties at different stages of the execution. For example, if I do a .scan() to fold a state which I later .take() and skip(), I need to be careful to do the .skip() before the .take() to avoid .scan() potentially missing out on an event (which would turn the state invalid going forward). Alternatively I can remember to add a dummy subscriber. This is a pretty big gotcha that can produce subtle race-condition issues.

If we want optimal performance and resource management, then we have to expose some variation of activation to the developer. However, it's not clear to me that doing this by default is a compelling trade-off. The non-default way of doing this might be to create optionally inactive sources, but I haven't thought carefully about this.

This thread (and the referenced ones) have compelling arguments for activating stateful streams by default, whereas the arguments for performance-by-default seem more hypothetical to me (though I could be wrong about that!).

All this said, there's probably a good argument to be made that such a change to the fundamental trade-offs should be implemented as a different library..