baconjs / bacon.js

Functional reactive programming library for TypeScript and JavaScript
https://baconjs.github.io
MIT License

Generalized `sampledBy` for dealing with glitches by hand #383

Closed rpominov closed 9 years ago

rpominov commented 10 years ago

As the glitches issue is not completely fixed (https://github.com/baconjs/bacon.js/issues/353), I propose introducing a more generalized sampledBy for those people who want to deal with glitches by hand.

Here is the signature:

Bacon.sampledBy(
  [streamA, streamB...], 
  [samplerC, samplerD...], 
  (a, b... c, d...) ->
    ...
)

It is basically the same as stream.sampledBy(sampler, fn), but with multiple streams and multiple sampler streams.

How might it help with glitches? For example, you have four streams A, B, C, and D. You know that A and B have a common original stream (they always trigger simultaneously), and so do C and D. And you need to combine all four streams. With Bacon.sampledBy, instead of doing this:

Bacon.combineWith (a, b, c, d) ->
  ...
, A, B, C, D

do:

Bacon.sampledBy [a, c], [b, d], (a, c, b, d) ->
  ...

And get the same result as with combineWith, but without the need to inspect dependencies and try to get rid of glitches automatically.
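To make the intended semantics concrete, here is a toy model of the proposed combinator (my own illustration, not Bacon's API or implementation), assuming each stream is just a sorted array of {t, v} events: the result fires whenever any sampler fires, applying fn to the latest value of every stream and sampler.

```javascript
// Hypothetical model of the proposed Bacon.sampledBy(streams, samplers, fn).
// Streams are plain sorted arrays of {t, v} events; this only illustrates
// the intended semantics, it is not Bacon code.
function sampledByModel(streams, samplers, fn) {
  const all = streams.concat(samplers);
  // Every time point at which any sampler fires.
  const sampleTimes = [...new Set(samplers.flatMap(s => s.map(e => e.t)))]
    .sort((x, y) => x - y);
  const result = [];
  for (const t of sampleTimes) {
    // Latest value of each input at time t (undefined if none yet).
    const latest = all.map(s => {
      const past = s.filter(e => e.t <= t);
      return past.length ? past[past.length - 1].v : undefined;
    });
    if (latest.every(v => v !== undefined)) {
      result.push({ t, v: fn(...latest) });
    }
  }
  return result;
}

// a and b fire together, as do c and d, so sampling by b and d produces
// exactly one combined event per simultaneous round -- no glitchy mixtures.
const a = [{ t: 1, v: 'a1' }, { t: 2, v: 'a2' }];
const c = [{ t: 1, v: 'c1' }, { t: 2, v: 'c2' }];
const b = [{ t: 1, v: 'b1' }, { t: 2, v: 'b2' }];
const d = [{ t: 1, v: 'd1' }, { t: 2, v: 'd2' }];
const out = sampledByModel([a, c], [b, d],
  (av, cv, bv, dv) => [av, cv, bv, dv].join('+'));
console.log(out);
// [{ t: 1, v: 'a1+c1+b1+d1' }, { t: 2, v: 'a2+c2+b2+d2' }]
```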

What do you think? :)

raimohanska commented 10 years ago

Well, that makes sense indeed. I've often felt the need for an easy way to say "I need the values of x, y, z and updates on x only". This method should be pretty straightforward to implement too. Wanna try?

Are you suggesting we remove the automatic glitch prevention mechanisms? :) They've proved quite a lot harder to implement correctly than it first seemed, I have to admit.

rpominov commented 10 years ago

I can try to make a PR, but not very soon (next week, probably).

I would keep the current limited mechanism, but wouldn't try to extend it to streams with dynamic dependencies (like flatMapLatest), at least until we have a good algorithm for that.

philipnilsson commented 10 years ago

Hey guys,

I currently don't have a very good understanding of Bacon's approach to glitch prevention. I'm not sure there is any way of completely removing glitches that doesn't require maintaining a dependency graph of the observables and traversing it in its entirety on notifications. Every paper I've read on the subject uses this approach, with optimizations for special cases.

Bacon seems stuck in some kind of middle ground between non-glitchy and glitchy, and I'm not quite sure I completely support this design decision, since I've never quite understood the intended semantics.

Perhaps we're only aiming for a loosely defined sense of "prevent some glitches some of the time" for common cases, but this does feel hard to reason about. For instance, if I wanted to prove the associativity law for monads, I'd be unsure how to argue whether changing the order of composition might affect glitchiness. In fact, I'm not sure I can even say exactly what the semantics should be.

I'd argue the right way to go about this is either

1) Come up with a strategy for preventing glitches entirely when possible, and introduce combinators such as mergeWith that would allow you to combine a list of possibly simultaneous events to resolve ambiguities. I think this might have significant effects on implementation and performance characteristics, and is not really viable.

2) Allow users to manually restrict when events happen via combinators such as zip, when and sampledBy, as before the atomic-updates refactorings.

Perhaps I only need to be educated on the intended semantics here, so take what I'm saying with a grain of salt. I do feel it's gotten slightly hard to contribute to Bacon (or at least the core), since this aspect of the library is a bit hazy to me, both in terms of semantics and implementation. Perhaps you could formalize your vision of this feature a bit, @raimohanska?

raimohanska commented 10 years ago

@philipnilsson Sorry for the late reply. I've been too busy with other stuff...

Yeah, the atomic updates thing proved harder than it seemed at first. There is indeed a dependency graph, but I'm using some (not as) smart (as I thought) optimizations to keep performance nearly at the pre-0.7 level. I just fixed #363 with a simple fix, just like I fixed a number of issues before that. Still, issue #353 is open and seems non-trivial to fix without sacrificing performance. The problem there is that, unlike all other streams, flatMapped ones have changing dependencies, because new dependencies can be introduced for each source event.

So even though I think it's possible to fix #353 and probably any future findings too, it comes at the cost of increased complexity.

Btw, the goal of Bacon.js is quite ambitious compared to Rx and Elm, for instance:

  1. Has flatMap, and subscribers can be added any time (unlike Elm)
  2. No glitches in diamondlike setups (unlike Rx)
  3. Consistent EventStream/Property behavior (unlike Rx)
  4. Lazy evaluation whenever possible (unlike Rx?)
  5. Always a single path to source observables, eliminating duplicate calls to map functions etc. (unlike Rx)

This is not to undermine Rx or Elm, which have different and simpler approaches in the core (correct me if I'm wrong here).

Both are tools that do their thing very well. It's just that I want best of both worlds and that's challenging!

As for my vision of what to do with glitches: not entirely sure :)

I think we should stress correctness over performance and eliminate glitches totally. This is not easy, but will provide the best user experience. On the other hand there might be room for a lighter-weight high-performance bacon without the glitch prevention mechanism.

staltz commented 10 years ago

Nice comparison of Elm, Rx, and Bacon.

One observation on "Lazy evaluation whenever possible (unlike Rx?)". Rx has lazy evaluation and they call it "Cold observables": http://www.introtorx.com/Content/v1.0.10621.0/14_HotAndColdObservables.html

raimohanska commented 10 years ago

Cold observables are not what I meant. Instead, they are one of the problems that caused me to start writing my own library. Lazy eval in Bacon means that combinator funcs are not called until a value is needed.


staltz commented 10 years ago

Could you explain how combinator funcs on cold observables are not lazily evaluated in Rx? At first impression, it does look like they are: http://jsfiddle.net/9EjSQ/54/

phadej commented 10 years ago

Do you, @raimohanska, mean f in observable.map(f) by combinator function?

I'd say there is huge confusion of terms: lazy evaluation vs. lazy propagation; is the "combinator function" the combinator itself (e.g. map) or its parameter? Etc.

You should really write a blog post about the differences between Rx and Bacon (and try to establish the vocabulary), or give a (recorded) talk ;)

raimohanska commented 10 years ago

Funny that this issue turned into a discussion on laziness. But let's discuss that :)

@staltz I didn't say anything about combinator funcs on cold observables. I just said that cold observables are not the thing I was talking about.

The laziness I'm talking about is, for instance, that the f in map is invoked only when the value is needed. So if you write result = stream.map(f).sampledBy(sampler), the function f will be called only for the values that result actually outputs, not for all values of stream.
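This kind of laziness can be sketched with explicit thunks (my own illustration, not Bacon's internals): map composes f onto a thunk without calling it, and only forcing the thunk in a subscriber triggers evaluation.

```javascript
// Sketch of laziness via thunks: each "event" is a thunk, map wraps the
// thunk without invoking f, and f only runs when a value is forced.
let calls = 0;
const f = x => { calls++; return x * 10; };

// A "stream" here is just an array of thunks.
const stream = [1, 2, 3, 4].map(v => () => v);
const mapped = stream.map(thunk => () => f(thunk())); // f not called yet

// sampledBy-like step: only the last event's value is actually needed.
const sampled = mapped[mapped.length - 1];
const result = sampled();
console.log(result); // 40
console.log(calls);  // 1 -- f ran once, not four times
```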

@phadej Yeah, I should, or someone should. Unfortunately I don't have the time right now. I feel bad for not being able to dedicate enough time to Bacon.js. On the other hand, no one pays me to do so.

staltz commented 10 years ago

Understood now. I guess it's hard to compare the libraries when they follow slightly different paradigms. For example, there isn't a sampledBy() in Rx, so it's hard to argue about lazy evaluation there.

raimohanska commented 10 years ago

I think there's sampledBy or equivalent in Rx for sure.

phadej commented 10 years ago

@staltz There is e.g. observable.throttle and observable.pausable where events are dropped without inspecting their value.

@raimohanska, I guess you can build sampledBy from combineLatest and pausable, or from flatMapLatest?

Yet one can argue whether lazy evaluation is that important; after all, you can pass thunks around. That will need some boilerplate, but it might be a good thing to have laziness explicit.

@raimohanska I'm still a bit confused. Aren't eventstreams similar to cold observables, i.e. they deliver events only when there is a subscriber? Or is a subscriber here something different: in Rx anything, and in Bacon.js nodes with onValue etc. callbacks (or flatMap*)?

And similarly, Rx's .replay(1) is kind of like Bacon.js's toProperty, but (cached) hot observables are glitchy Properties?

EDIT: disclaimer: my understanding of Rx is based on ReactiveCocoa (long shot!), which I use for data flow, i.e. with behavior subjects, which is almost against official best practices.

raimohanska commented 10 years ago

@phadej Both Rx and Bacon.js share the same idea of laziness wrt subscriptions: the observables only connect to their underlying data sources when there are observers. The difference is that Rx observables by default establish a separate connection for each incoming subscriber, while Bacon.js always uses a single connection. In Rx you can use refCount to get behavior similar to what Bacon.js uses by default.

Rx cold observables are ones that output the same sequence of events to any incoming subscriber at any time. For instance, if you make an Rx.Observable.fromArray([1,2,3]), each incoming subscriber will get the values 1, 2 and 3. So they cannot be described as event streams at all, because they repeat the same event at a different time for each incoming subscriber. In Bacon.js, eventstreams always output the same value to all subscribers at the same time, and they never repeat the same event later. The Bacon.fromArray([1,2,3]) stream is also "cold" in the sense that it doesn't start producing values until someone subscribes. The difference is that it'll produce the values to the first subscriber only, because repeating the same values would not be consistent with the definition of an eventstream (see semantics).
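The distinction can be shown with two toy constructors (my own minimal code, not Rx or Bacon internals): a cold observable replays its values to every subscriber, while a Bacon-style eventstream delivers each event only once.

```javascript
// Cold observable: replays the whole sequence for every subscriber.
function coldFromArray(values) {
  return { subscribe: cb => values.forEach(cb) };
}

// Bacon-style eventstream: events are delivered once; later subscribers
// have simply missed them.
function streamFromArray(values) {
  let consumed = false;
  return {
    subscribe: cb => {
      if (!consumed) { consumed = true; values.forEach(cb); }
    }
  };
}

const coldOut = [];
const cold = coldFromArray([1, 2, 3]);
cold.subscribe(v => coldOut.push(v));
cold.subscribe(v => coldOut.push(v));
console.log(coldOut); // [1, 2, 3, 1, 2, 3] -- sequence repeats per subscriber

const streamOut = [];
const stream = streamFromArray([1, 2, 3]);
stream.subscribe(v => streamOut.push(v));
stream.subscribe(v => streamOut.push(v));
console.log(streamOut); // [1, 2, 3] -- the second subscriber sees nothing
```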

Here's my old rant about Rx.js that describes why these Rx design choices pushed me to start working on a library of my own...

phadej commented 10 years ago

So if I have

var b = a.map(f);
var c = b.map(g);
var d = b.map(h);

then on one event in a, f will be called twice in Rx?
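A toy push-based model (my own sketch, not actual Rx code) shows why this happens without sharing: each downstream subscription builds its own chain through map(f), so one source event evaluates f once per subscriber.

```javascript
// Minimal push-based observables: subscribing to a mapped observable
// creates a fresh chain back to the source.
function source() {
  const subs = [];
  return {
    subscribe: cb => subs.push(cb),
    push: v => subs.forEach(cb => cb(v))
  };
}
const map = (obs, fn) => ({ subscribe: cb => obs.subscribe(v => cb(fn(v))) });

let fCalls = 0;
const a = source();
const b = map(a, x => { fCalls++; return x + 1; }); // f
const c = map(b, x => x * 2);                       // g
const d = map(b, x => x * 3);                       // h
c.subscribe(() => {});
d.subscribe(() => {});
a.push(1);
console.log(fCalls); // 2 -- each subscriber reached f through its own chain
```

A shared (refCounted) connection, as Bacon uses by default, would evaluate f only once here.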

Maybe because I played with behavior subjects, I didn't fall into the scan trap. I did observe the multi-evaluation, though; I just wasn't aware why it happened. This explains it, great thanks!

P.S. Rx documentation has improved over the years; we have to catch up!

staltz commented 10 years ago

@phadej, behavior subjects (and other subjects) are by definition hot observables, so that's why. Nowadays I try to use .publish().refCount() whenever I can.

Anyway, this is supposed to be about Bacon. Sorry for the off-topic.

raimohanska commented 10 years ago

@phadej I believe that would happen, i.e. the double evaluation in your case.

And yeah, bacon.js docs haven't evolved much. You've contributed some improvements though :)

phadej commented 10 years ago

I'm short on free time too. And there are so many interesting things to contribute to! Though I'm making slight progress on showing types in the docs more explicitly (the toggle). /offtopic

philipnilsson commented 10 years ago

Rx still has very simple semantics, essentially as collections, in many ways in line with Conal Elliott's semantics of streams as lists of time-value pairs. The fact that Bacon used to be even closer to these semantics is what originally attracted me to Bacon, since Rx's hot vs. cold distinction is not in line with this.

http://conal.net/papers/push-pull-frp/

Now when I ask myself what the behaviour of some combinators is, I'm not really sure anymore, since I can't think of the semantics in these terms. In a sense, events are now tagged with a "source" of some kind, and can be created and eliminated by some notion of "occurring at the same time" or "causing a glitch".

What are the semantics of the following stream?

var z = Bacon.fromArray([{x: 10, y: 20}]);
z.map('.x').merge(z.map('.y')).log()

In the current version it prints 10, and nothing else. On what basis do we assign these semantics? Is it even correct?

The other big issue is how to implement Bacon in a way where we can be confident in the correctness of the library. In my opinion it's not ideal to rely solely on testing for quality. What would it take to get everything working according to the hopefully-soon-specified semantics, also in the case of flatMap and dynamic event switching?


raimohanska commented 10 years ago

@philipnilsson The example with fromArray shows one of the pain points, i.e. synchronously responding sources. The correct output would be 10, 20, and the actual output is just 10. This is covered in #367.

There's a limited number of combinators that are concerned with glitch elimination: sampledBy, combine, takeUntil, I think. It might make sense to spell out the semantics.

I'm afraid automated tests are the best guarantee of correctness we can get. Property-based testing with generated test data might provide more coverage.

To get everything working according to the semantics? For me that means fixing the known issues #367 and #353.

philipnilsson commented 10 years ago

Conal's semantics for streams is EventStream a = [(a, Time)].

In this semantic domain we can describe the merge operation as a left-biased merge, like the one in the merge step of the merge sort algorithm, comparing on Time.

We can define the semantics of delay as delay d = map (second (+d)), and so on. This is of course only a way of specifying semantics, not an actual implementation.
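The denotational model above can be transcribed directly into executable form (a sketch of the specification only, not an implementation proposal): an EventStream is an array of [value, time] pairs sorted by time; merge is the merge step of merge sort, left-biased on equal times; delay shifts every timestamp.

```javascript
// EventStream a = [(a, Time)], modeled as an array of [value, time] pairs.
function merge(xs, ys) {
  const out = [];
  let i = 0, j = 0;
  while (i < xs.length && j < ys.length) {
    // <= makes the merge left-biased: on equal times the left event goes first.
    if (xs[i][1] <= ys[j][1]) out.push(xs[i++]); else out.push(ys[j++]);
  }
  return out.concat(xs.slice(i), ys.slice(j));
}

// delay d = map (second (+d))
const delay = (d, xs) => xs.map(([v, t]) => [v, t + d]);

const a = [['a1', 1], ['a2', 3]];
const b = [['b1', 1], ['b2', 2]];
const merged = merge(a, b);
const delayed = delay(5, b);
console.log(merged);  // [['a1',1], ['b1',1], ['b2',2], ['a2',3]]
console.log(delayed); // [['b1',6], ['b2',7]]
```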

What is a semantic domain where we can talk about things like glitch elimination? It seems that in this domain events are somehow tagged with their source, and some combinators use this information to sometimes combine events.

As the problem with synchronous events shows, the criterion is not events occurring at the same time. Perhaps the correct modification to the semantics is some notion of "infinitesimals" of time, such that in fromArray([1,2,3]) the events can be seen to occur at times 0, 0 + dt, 0 + 2*dt?

It might be that this would lead us to a place where synchronous events (in the JS runtime) are not possible?
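One way to make the "infinitesimals" idea concrete (purely my own illustration, not a Bacon proposal): represent a timestamp as a pair [t, k] meaning t + k*dt, compared lexicographically, with fromArray assigning increasing k at a fixed t.

```javascript
// Timestamps are [t, k] pairs meaning t + k*dt, ordered lexicographically.
const cmp = (x, y) => (x[0] - y[0]) || (x[1] - y[1]);

// fromArray at real time t: events are ordered among themselves, but all
// "simultaneous" with respect to real time.
const fromArray = (values, t) => values.map((v, k) => [v, [t, k]]);

const xs = fromArray([1, 2, 3], 0);
console.log(xs.map(([v, [t, k]]) => `${v}@${t}+${k}dt`));
// ['1@0+0dt', '2@0+1dt', '3@0+2dt']

// Ordering a merged list by this notion of time interleaves two fromArray
// streams created at the same real time, unlike the actual implementation.
const merged = fromArray([0, 1, 2], 0).concat(fromArray([3, 4], 0))
  .sort((p, q) => cmp(p[1], q[1]));
console.log(merged.map(e => e[0])); // [0, 3, 1, 4, 2]
```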

phadej commented 10 years ago

TL;DR: assigning semantics afterwards is not easy.

infinitesimals

@philipnilsson The infinitesimals as such could work.

If the events produced by Bacon.fromArray([0, 1, 2]) occur at t, t + dt, t + 2 dt, then it's easy to come up with simple-looking but semantically difficult situations:

var x = Bacon.fromArray([0, 1, 2]);
var y = Bacon.fromArray([3, 4]);

x.merge(y).log(); // ???

If events in x and y occur at t, t + dt, t + 2 dt and t, t + dt respectively, the output should be 0 1 2.

Yet currently the output is 0 1 2 3 4. That could be explained by the idea that the events in y occur at t + 3 dt and t + 4 dt.

Then again, if we swap the arguments of merge:

var x = Bacon.fromArray([0, 1, 2]);
var y = Bacon.fromArray([3, 4]);

y.merge(x).log();

the output is 3 4 0 1 2.

And this could also be explained by another idea: that even in a synchronous block, t is not fixed but floating. After all, nothing happens if there aren't observers ⇒ the connection order matters.
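The connection-order effect is easy to reproduce with a toy model (my own sketch, not Bacon internals): a synchronous fromArray emits everything during subscribe, so merge drains whichever side it connects to first.

```javascript
// Synchronous source: emits all values the moment someone subscribes.
function fromArraySync(values) {
  return { subscribe: cb => values.forEach(cb) };
}
// Naive merge: connects left first, then right.
function mergeSync(left, right) {
  return { subscribe: cb => { left.subscribe(cb); right.subscribe(cb); } };
}
const collect = obs => {
  const out = [];
  obs.subscribe(v => out.push(v));
  return out;
};

const xy = collect(mergeSync(fromArraySync([0, 1, 2]), fromArraySync([3, 4])));
console.log(xy); // [0, 1, 2, 3, 4]
const yx = collect(mergeSync(fromArraySync([3, 4]), fromArraySync([0, 1, 2])));
console.log(yx); // [3, 4, 0, 1, 2]
```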

fromArray is not trivial; one could say that sequential delivery would be enough. But there is fromBinder, which gives the user the opportunity to do anything. And that one we want to keep.

x.merge(x)

As @raimohanska said

 var z = Bacon.fromArray([{x: 10, y: 20}]);
 z.map('.x').merge(z.map('.y')).log();

should produce 10 20.

In this case the semantics for streams should be EventStream a = OrderedMap Time [a] (not OrderedMap Time a; this captures the uniqueness of time points better), and

merge = OrderedMap.mergeWith (++) -- concat the events

And again, difficulties. Properties aren't simple. If their semantics is "the current value is the last event that occurred in the underlying eventstream", what does combineWith do when there are multiple events at a single time point?

combineWith f = OrderedMap.mergeWith (zipWith f)
combineWith f = OrderedMap.mergeWith (liftA2 f)
combineWith f = OrderedMap.mergeWith (last . liftA2 f)

var x = Bacon.fromArray([0, 1, 2]).toProperty();
var y = Bacon.fromArray([3, 4]).toProperty();

Bacon.combineWith(function (l, r) { return [l, r]; }, x, y).log(); // this is really hard to guess!

But I definitely don't expect:

[2, 3]
[2, 4] 
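The OrderedMap Time [a] model above can be sketched with a plain Map from time to an array of simultaneous events (illustration only): merge concatenates the event lists at each shared time point, which is exactly why z.map('.x').merge(z.map('.y')) should yield both 10 and 20 at the same time.

```javascript
// EventStream a = OrderedMap Time [a], modeled as Map<time, values[]>.
// mergeWith combines the value lists at coinciding time points.
function mergeWith(combine, m1, m2) {
  const out = new Map(m1);
  for (const [t, evs] of m2) {
    out.set(t, out.has(t) ? combine(out.get(t), evs) : evs);
  }
  // Keep the map ordered by time.
  return new Map([...out].sort((p, q) => p[0] - q[0]));
}
const concat = (p, q) => p.concat(q);

// z.map('.x') and z.map('.y') both fire at time 0.
const xs = new Map([[0, [10]]]);
const ys = new Map([[0, [20]]]);
const merged = mergeWith(concat, xs, ys);
console.log([...merged]); // [[0, [10, 20]]] -- both events survive the merge
```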

flatMap

All simple(?) and non-implementation-driven ideas about flatMap and other dynamic event switching boil down to: postpone all side-effects until all events are propagated, where adding new observers is also a side-effect. That would e.g. help to understand what really should happen in the https://github.com/baconjs/bacon.js/issues/363 jsfiddle example (not to say that a side-effectful parameter to flatMap sends the user into the "you are on your own" realm).

Cyclic graphs

With postponed side-effects, bus.plug() should IMHO make the example https://gist.github.com/phadej/cb3d2bb6cd5bd793f073 work even without a delay?

Also, with flatMap you can make cyclic graphs. I'm too tired right now, but I guess you could write something like mfix for Bacon and rewrite the example above using that?

philipnilsson commented 10 years ago

With the "infinitesimals" semantics I'd expect combining fromArray([0, 1, 2]) with fromArray([3, 4]) to give

[0, 3], [1, 4], [2, 4]

I have no idea how that could be implemented, though, which is part of my problem with the current situation. I can't easily find semantics that are both reasonably clear and possible to implement. This would be problematic even if fromArray were not synchronous, as far as I can tell.

With glitchy semantics, the result of [2, 3], [2, 4] is completely reasonable though.

RoboTeddy commented 10 years ago

Here's an attempt to put FRP semantics on firm ground: http://haskell.cs.yale.edu/wp-content/uploads/2011/02/frp-1st.pdf

raimohanska commented 10 years ago

This issue has diverged into two:

1) add a generalized Bacon.sampledBy
2) define semantics

The former is almost trivial (who's gonna PR?), while the latter has proven very difficult. The current implementations of once and fromArray don't create proper eventstreams as in EventStream a = [(a, Time)]. Neither do I think it's possible to make them so, at least not very easily. I've been thinking about demoting these guys from EventStreams to mere Observables, which they certainly are...

rpominov commented 10 years ago

Actually, I made a PR (https://github.com/baconjs/bacon.js/pull/387), but failed to finish the job. Maybe it's not as trivial as it seems. The problem is that merge doesn't work well with synchronous streams and properties.

Maybe it's good as it is. Semantically speaking, it doesn't make much sense to sample something by synchronous events, or by a property's current value.

I'm completely OK with somebody making another PR, but help finishing mine would be nice too :)