Gozala / reducers

Library for higher-order manipulation of collections

channels are harmful #23

Closed: Gozala closed this issue 11 years ago

Gozala commented 11 years ago

After thinking about channels and reducers I'm coming to the conclusion that they are harmful. Initially there were only signals, and hub could be applied to multiplex them. After some discussions with @gordonbrander he convinced me that multiplexing would be needed often enough that it should be built in, and that's how channels came to be. Now there were several things I disliked but was ok with:

  1. The ability to observe a stream should not imply the ability to push new values into it. If I hand several consumers the same channel, they can mislead each other by emitting events into it, closing it, or erroring it.
  2. One-way data flows are usually easier to reason about; things that are both readable and writable can cause problems if a consumer attempts to read instead of write, or vice versa. For example, in reflex, read returns a channel that one is supposed to read from, but in fact you can also write into it, and there can be cases where you get a channel and don't know whether you should read from it or write to it.

But the other day @raynos was completely misled, as reductions and other transformations of his channel were happening several times. Transformations do not create new channels that multiplex; they create transformation objects. More simply, a transformed reducible reads from its source, in this case from the channel, so each reduction read from the channel, and all the transformations happened downstream, causing multiple invocations of the transformation functions (see the sketch below). I myself was the one who made this mistake, which made me realize that implicit multiplexing in some constructs is harmful, as it makes you believe it's always the case, which it is not. If it were always explicit, one would have to think about which points in the transformation pipeline actually need multiplexing.
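
To make the failure mode concrete, here is a minimal self-contained model of the situation (toy code for illustration, not the library's actual implementation):

function reducible(producer) {
  return { reduce: function(next) { producer(next) } }
}

// Transformations wrap the source; no multiplexing point is introduced.
function map(source, f) {
  return reducible(function(next) {
    source.reduce(function(value) { next(f(value)) })
  })
}

var source = reducible(function(next) { [1, 2, 3].forEach(function(x) { next(x) }) })
var doubled = map(source, function(x) {
  console.log('transforming', x)
  return x * 2
})

// Two independent reductions: each consumer reads from the source anew,
// so the transformation function runs once per item *per consumer*.
doubled.reduce(function(x) { /* consumer a */ })
doubled.reduce(function(x) { /* consumer b */ })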

Gozala commented 11 years ago

Another option would be to make multiplexing part of all transformations, but then lots of benefits would be lost. Not to mention that along with the choice not to multiplex, I believe there are cases where multiplexing is actively wrong. For example, queues are not supposed to multiplex: if consumer a takes an item from a queue and consumer b takes an item from the queue, two different items should be taken, not one item that they both receive (see the sketch below).
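
For example (a toy sketch; the take-based queue API is hypothetical):

function queue(items) {
  return {
    // take removes and returns the next item; consumers compete for items.
    take: function() { return items.shift() }
  }
}

var q = queue(['a', 'b', 'c'])
q.take()  // consumer a gets 'a'
q.take()  // consumer b gets 'b', a different item, not a copy of 'a'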

Raynos commented 11 years ago

If we don't have a channel I would like something like

pubit read-stream

I.e. a reducible construct that returns two values: one being a reducible, and one being some object you can use to put things into the reducible.

read-stream returns a strictly readable stream that you can give to other people, and a queue object you can use to put stuff in the readable stream.

pubit gives you an event emitter and allows you to create the read-only part (i.e. on but not emit) to give to others.
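
Something like the following shape, roughly (a hedged sketch; splitChannel, readable, and writer are names made up for illustration, not an existing API):

function splitChannel() {
  var consumers = []
  var readable = {
    // Read-only side: consumers can subscribe but cannot push, close, or error.
    reduce: function(next) { consumers.push(next) }
  }
  var writer = {
    // Write side, kept private by whoever created the pair.
    put: function(value) {
      consumers.forEach(function(next) { next(value) })
    }
  }
  return { readable: readable, writer: writer }
}

var pair = splitChannel()
pair.readable.reduce(function(value) { console.log('got', value) })
pair.writer.put(42)  // logs: got 42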

Raynos commented 11 years ago

@Gozala as for the multiplexing part. I think a good compromise is wrapping a reducible in hub once but then somehow every other transformation applied on it afterwards wraps it in a fresh hub.

i.e. make multiplexing part of all transformations iff the source of the transformation has been multiplexed once.

I can't think of any behaviour where you want to multiplex something once and then stop multiplexing down the line.
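
Roughly, under the toy reducible model above, that compromise might look like this (the multiplexed flag and helper names are hypothetical):

function hub(source) {
  var consumers = []
  var started = false
  return {
    multiplexed: true,
    reduce: function(next) {
      consumers.push(next)
      if (!started) {
        started = true
        // Read the source once and fan each value out to every consumer.
        source.reduce(function(value) {
          consumers.forEach(function(consumer) { consumer(value) })
        })
      }
    }
  }
}

// A map that inherits multiplexing: if the source was hub-wrapped,
// the transformed result gets wrapped in a fresh hub as well.
function inheritingMap(source, f) {
  var transformed = {
    reduce: function(next) {
      source.reduce(function(value) { next(f(value)) })
    }
  }
  return source.multiplexed ? hub(transformed) : transformed
}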

gordonbrander commented 11 years ago

On Nov 4, 2012, at 1:06 AM, Raynos notifications@github.com wrote:

@Gozala as for the multiplexing part. I think a good compromise is wrapping a reducible in hub once but then somehow every other transformation applied on it afterwards wraps it in a fresh hub.

That makes a lot of sense to me. Intuitively, I think there is an analogy to be made with array transformations, which always return a new array.

I don't have as deep an understanding of the pros/cons, but from a usability standpoint, if I can use the same reduce method on arrays and on reducibles, I expect it to work in much the same way for both.

Gozala commented 11 years ago

Nope, map(array, f) does not return an array; it returns a reducible which, when reduced, will run values through the transformation pipe.

Gozala commented 11 years ago

If we don't have a channel I would like something like

pubit read-stream

I.e. a reducible construct that returns two values: one being a reducible, and one being some object you can use to put things into the reducible.

read-stream returns a strictly readable stream that you can give to other people, and a queue object you can use to put stuff in the readable stream.

pubit gives you an event emitter and allows you to create the read-only part (i.e. on but not emit) to give to others.

That's where one would need an event and a signal, which I'm refactoring now:

function read() {
  var e = event()
  // Emit the current time every 100ms.
  setInterval(function() {
    send(e, Date.now())
  }, 100)
  return signal(e)
}

I'm also considering a less imperative API, though I'm not sure it will work for all cases:

function read() {
  return signal(function(next) {
    // Deliver the current time to the consumer every 100ms.
    setInterval(function() {
      next(Date.now())
    }, 100)
  })
}

Gozala commented 11 years ago

@Gozala as for the multiplexing part. I think a good compromise is wrapping a reducible in hub once but then somehow every other transformation applied on it afterwards wraps it in a fresh hub.

Nope. Imagine a case where I have a multiplexed stream a which has been transformed into b and then forked into c and d. If the fork c is wrapped into a queue producing q, and multiplexing is carried all the way through, q actually won't behave as one would expect: consumers of q should compete for items, but inherited multiplexing would hand each of them every item.

Gozala commented 11 years ago

I don't think that ambiguity is a good trade for usability, not to mention that there is a performance cost associated with it. The only reason intuition makes you expect multiplexing to be built in is experience with EventEmitter, which is the wrong association for reducers. Reducers are like function compositions; think of the following case (I'll use underscore functions for the purpose of demonstration):

// Calculation of fibonacci is slow, so to multiplex :) and
// optimize access we memoize it.
var fibonacci = _.memoize(function(n) {
  return n < 2 ? n : fibonacci(n - 1) + fibonacci(n - 2)
})

// But actually we'll have a bunch of use cases for
// fib number transformation
var data = compose(transform, fibonacci)

Now if you give data to multiple users and they both call it, both of them will compute the transformation! If you wanted to optimize and run the transformation only once, you should have memoized that too. But you don't want to make that the default, because maybe transform does this:

function transform(x) {
  return { time: Date.now(), x: x }
}

In which case it actually should run the transformation each time. This is a much better analogy for how reducers compose transformations.
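
For comparison, a caller who did want the transformation computed only once per input could opt in by memoizing the composition explicitly (a sketch reusing the same helpers as above):

var sharedData = _.memoize(compose(transform, fibonacci))
// Note: with the transform above this would freeze `time` at the first
// call for each n, which is exactly why it can't be the default.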

Gozala commented 11 years ago

Also note that the behavior of reducers is the same: if you do b = map(array, f) and then invoke reduce(b, acc) twice, f will run twice.
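
Spelled out as a sketch (assuming map and reduce as used elsewhere in this thread, with a reduce(source, f, initial) signature):

var b = map([1, 2, 3], function f(x) {
  console.log('running f on', x)
  return x + 1
})

// Each reduction walks the whole transformation pipe again:
reduce(b, function(acc, x) { return acc + x }, 0)  // logs "running f on" three times
reduce(b, function(acc, x) { return acc + x }, 0)  // logs all three again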

Raynos commented 11 years ago

From an FRP point of view

function read() {
  var e = event()
  setInterval(function() {
    send(e, Date.now())
  }, 100)
  return signal(e)
}

Feels unnatural.

Raynos commented 11 years ago

The only reason intuition makes you expect multiplexing to be built in is

I expect multiplexing built in because pipe handles multiple targets cleanly. And my implementation of lazy pipe also handles multiple targets cleanly. I.e. you do the transformation once and it goes to all writers.

Gozala commented 11 years ago

I expect multiplexing built in because pipe handles multiple targets cleanly. And my implementation of lazy pipe also handles multiple targets cleanly. I.e. you do the transformation once and it goes to all writers.

My question is: do you find the analogy with function composition I made helpful, and is it clear now that there are cases where you would not want to multiplex? Or are you still convinced it should be built in and inherited by transformations?

Gozala commented 11 years ago

There is an alternative path which is a little more in the spirit of what you suggested, but it makes a lot of things more complicated and I'm not sure it's worth it: transformations could delegate to the source to do the assembly of the results, which could produce the desired behavior. The issue is that the behavior becomes bound to the source input, which would mean that depending on the input one will either need to multiplex via hub or not. That means a composition will need to have a different structure depending on the input it will work with. This makes me believe that it will be easier to make mistakes because of false expectations about the input source.

Gozala commented 11 years ago

Another good example of where multiplexing is undesired is here: https://github.com/Gozala/fs-reduce/blob/master/test/write.js#L50-64 Note how fileContent is a reducible whose reduction causes a fresh file open and read, which makes the composition equivalent to a function composition that can be invoked any time you like (see the sketch below). With built-in multiplexing, subsequent reductions would end up as empty streams, since the first one would reach the end of the file. It also makes data structures impure in the functional sense, as the result of reducing a reducible becomes dependent on whether it has already been reduced or is currently in the process of being reduced.
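
A hedged sketch of the same idea using Node's fs directly (illustrative only, not the actual fs-reduce code; the file path is hypothetical):

var fs = require('fs')

function fileContent(path) {
  return {
    // Each reduction opens the file afresh, so reducing twice reads the
    // whole file twice rather than the second consumer finding it drained.
    reduce: function(next, done) {
      var stream = fs.createReadStream(path)
      stream.on('data', next)
      stream.on('end', done)
    }
  }
}

var content = fileContent('./fixture.txt')  // hypothetical path
content.reduce(function(chunk) { /* first full read */ }, function() {})
content.reduce(function(chunk) { /* second, also full */ }, function() {})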

Gozala commented 11 years ago

I would still really like to convince you, or be convinced of the opposite myself, so let's keep this discussion alive.

Gozala commented 11 years ago

From an FRP point of view

function read() {
  var e = event()
  setInterval(function() {
    send(e, Date.now())
  }, 100)
  return signal(e)
}

Feels unnatural.

Yes, it's terrible and encourages side effects; I just fear that the alternative may not cover all use cases.

Gozala commented 11 years ago

I expect multiplexing built in because pipe handles multiple targets cleanly. And my implementation of lazy pipe also handles multiple targets cleanly. I.e. you do the transformation once and it goes to all writers.

I'm pretty sure that if you pipe to one target and later (while piping is in progress) to another target, it will cause surprises, as the second one will miss data that has already been emitted. That's why I think the choice should be explicit rather than implicitly impure.

Raynos commented 11 years ago

My question is: do you find the analogy with function composition I made helpful

Yes.

Raynos commented 11 years ago

I'm pretty sure that if you pipe to one target and later (while piping is in progress) to another target, it will cause surprises, as the second one will miss data that has already been emitted.

That's expected behaviour. The data is in the buffer of the source; once you pipe, it moves to the buffer of the target. If you pipe to multiple targets, it moves the data to each one of those targets.

If you pipe later, it will only move the data it still has, which is not all the data it ever had (otherwise it would have to buffer the data forever).
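
A toy model of that buffering behaviour (illustrative only, not Node's actual pipe implementation):

function bufferedSource() {
  var buffer = []
  var targets = []
  return {
    write: function(value) {
      if (targets.length) {
        // Already piping: forward straight to every attached target.
        targets.forEach(function(target) { target(value) })
      } else {
        buffer.push(value)
      }
    },
    pipe: function(target) {
      buffer.forEach(target)  // flush whatever is still buffered...
      buffer = []
      targets.push(target)    // ...then forward all future writes
    }
  }
}

var source = bufferedSource()
source.write(1)
source.pipe(function(v) { console.log('a', v) })  // a 1
source.write(2)                                   // a 2
source.pipe(function(v) { console.log('b', v) })  // b never sees 1 or 2
source.write(3)                                   // a 3, b 3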

Gozala commented 11 years ago

Channels and all kinds of event-based reducers are no longer part of the core library.