Add new consumer stream "collect"

jasonkuhrt commented 9 years ago

I would like to propose a new stream consumer called collect for the consuming-streams category (https://github.com/cujojs/most/blob/master/docs/api.md#consuming-streams).

It would be sugar for:

stream.from([ 'apple', 'orange', 'banana' ])
.reduce((xs, x) => { xs.push(x); return xs; }, [])
.then(console.log) // [ 'apple', 'orange', 'banana' ]

becomes:

stream.from([ 'apple', 'orange', 'banana' ])
.collect()
.then(console.log) // [ 'apple', 'orange', 'banana' ]

I have found the collect function useful in systems where there are inherent race conditions in getting data, such that changes must be buffered and replayed in order after some async operation(s) complete.

Put another way, collect helps use FRP for micro ad-hoc queue situations. Very loosely speaking, this touches a bit on what CSP is/does, where everything is channels fronting queues that block on write or read (with opt-in semantic customization à la Clojure's API).

davidchase commented 9 years ago

Why not just extend most? That way we keep the core tiny. :smile:

What are your thoughts on something like this?

var most = require('most');
var Stream = most.Stream;

// Add collect to every stream by extending the prototype.
Stream.prototype.collect = function () {
  return this.reduce((xs, x) => { xs.push(x); return xs; }, []);
};

most.from(['apple', 'orange', 'banana'])
  .collect()
  .then(console.log.bind(console)); //=> [ 'apple', 'orange', 'banana' ]

jasonkuhrt commented 9 years ago

Ok +1

I had tried to extend Most before but ran into a bit of confusion. This is good enough for now.

With regard to a lean core, it seems like a most-engine should be extracted that has a vastly smaller API than this one.

briancavalier commented 9 years ago

Some interesting ideas here, @jasonkuhrt and @davidchase!

Personally, I haven't needed a collect combinator often enough to feel like I could justify adding it. If it seems to keep coming up, though, we should def discuss.

However, it does make me think that there might be some related combinators that are also worth considering:

- A last combinator that returns a promise for the last event in the stream has come up a few times.
- A collect-like API which, instead of returning a promise, returns a new stream containing 1 event: an array (or iterable) of the collected events. Then, .collect().last() gives you the ability to return a promise, but also allows using the two features independently if you need.
- Some sort of shortcuts for buffering by count or time (or both). Similar to collect, but instead of collecting all events, uses some sort of signal (i.e. another stream) to effectively group events into buckets and emit the buckets.
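A rough sketch of the three ideas above, using async iterables as a stand-in for streams so that it runs without most.js. The names from, collect, last, and bufferCount are assumptions made for illustration, not most.js API, and only the count-based buffering variant is shown (not the time-based or signal-based ones):

```javascript
// Async iterables stand in for streams here; none of this is most.js API.
async function* from(items) {
  yield* items;
}

// collect as a stream: emits exactly one event, the array of all events.
async function* collect(stream) {
  const xs = [];
  for await (const x of stream) xs.push(x);
  yield xs;
}

// last: a promise for the final event (undefined if the stream is empty).
async function last(stream) {
  let final;
  for await (const x of stream) final = x;
  return final;
}

// bufferCount: group events into buckets of n; a trailing partial bucket
// is emitted when the source ends.
async function* bufferCount(n, stream) {
  let bucket = [];
  for await (const x of stream) {
    bucket.push(x);
    if (bucket.length === n) {
      yield bucket;
      bucket = [];
    }
  }
  if (bucket.length > 0) yield bucket;
}

// Equivalent of .collect().last(): a promise for the collected array.
last(collect(from([1, 2, 3]))).then(console.log); // [ 1, 2, 3 ]
```

Because collect returns a stream here rather than a promise, it composes with other combinators, e.g. collect(bufferCount(2, stream)) gathers the buckets themselves.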

Any thoughts on those?

As for most-engine, yeah, as most.js has grown, I've been thinking something similar. In my head, I was calling it most-core, but the same idea.

What were you thinking would be in this smaller core?

davidchase commented 9 years ago

@briancavalier the ones you mentioned are all interesting, but I think the question still stands: is it possible, or of interest, to create something like most-collect that is separate from most-core?

Maybe most-core should just have the bare minimum: something like map, join, ap, of, merge, etc.

So something like this: since constant is really just a map whose function ignores its input and returns x, we can create a most-constant or have users simply do

most.from([1,2,3]).map(() => 1).forEach(console.log); //=> 1 1 1

Or most-flatmap which can be done with just a map + join...
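To make the composition idea concrete, here is a minimal sketch with plain arrays standing in for streams. The functions map, join, constant, and flatMap are defined locally for illustration and are not most.js exports:

```javascript
// Arrays stand in for streams; names are illustrative, not most.js exports.
const map = (f, xs) => xs.map(x => f(x));
const join = xss => xss.reduce((acc, xs) => acc.concat(xs), []);

// constant is map with a function that ignores its input.
const constant = (x, xs) => map(() => x, xs);

// flatMap is map followed by join.
const flatMap = (f, xs) => join(map(f, xs));

console.log(constant(1, [1, 2, 3])); // [ 1, 1, 1 ]
console.log(flatMap(x => [x, x * 10], [1, 2])); // [ 1, 10, 2, 20 ]
```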

Just throwing some ideas out there: maybe if you can "compose" something from the core modules, the result should be a separate module.

if that makes any sense :stuck_out_tongue:

jasonkuhrt commented 9 years ago

I can't speak to performance, but it is interesting to look at how a core like Prelude (Haskell) is built. Only a handful of primitives are needed, at least those that implement the functor, applicative functor, and monad interfaces.

In practice, especially in a language like JavaScript, the elegance of building up libraries from a core is thwarted by the free performance (for users) being left on the table. The reason for this is that the baseline in this case (JavaScript) and FRP are quite far apart, whereas FRP is a much more natural fit for Haskell, where you are forced to not mutate, be lazy, etc.

When the platform and abstraction implementation are close then you can cheat less or not at all. In the JavaScript world it seems like from experiences like Ramda we see that it is not practical to prioritize implementation elegance. Sadly this therefore seems to resist proper modularity.

ES2015 generators and TCO are important language additions for reducing this gap, but JavaScript is still a far cry from other modern platforms today.

jasonkuhrt commented 9 years ago

@briancavalier Great thoughts/points. Each of your proposed library additions/directions for exploration is interesting to me. I can't speak to them much yet because of my still-limited experience with most and generally limited practical time with FRP (I have a bit of theory).

One thing I will say, as a gut feeling: more work is needed in exploring FRP for concurrent programming.

A lot of the FRP literature talks in terms of GUI, and this is doubly so when FRP is discussed in the JavaScript community, where the language's primary niche/monopoly is GUI. However, concurrency is inherent and universal, and it is the reason for abstractions like CSP. We can see Go, a server-side niche language that built CSP into the core and requires all thinking to be done in terms of it. And knowing that FRP is just a higher abstraction than CSP (and to my knowledge can actually be modelled on top of CSP, e.g. csp.js), it raises the question: what would programming servers using only FRP look like?

I presume a powerful, granular, performant, and necessarily simple queue API in terms of FRP is the primary thing missing right now. Again, this is just gut feeling for now.

jasonkuhrt commented 9 years ago

@briancavalier

One piece of concrete feedback: .collect().last() seems intuitive :+1: .

briancavalier commented 9 years ago

One piece of concrete feedback: .collect().last() seems intuitive

@jasonkuhrt thanks!

In the JavaScript world it seems like from experiences like Ramda we see that it is not practical to prioritize implementation elegance

If you really want to push the perf envelope, this is definitely true. I'd also argue that it's true to some degree in any language, but perhaps more so in JS than some others due to the way VMs are implemented, and what/how they choose to optimize. That's unfortunate, but reality: if you want the absolute best performance, you have to write code that contains at least some knowledge about how the compiler will optimize it. So, as always, there's a tradeoff.

So far, the goal has been to push performance as far as possible. I think we could probably pull back from that a bit and simplify a few things.

is it possible or of interest to create something like most-collect which is a separate from the most-core

Yes, it's already possible. In fact, all of the existing combinators are essentially "separate" and only rely on the Source and Sink APIs, which are stable (tho currently undocumented :( ). It's all done via composition instead of inheritance atm. All of the instance methods are also available as pure functions, e.g. most.of(x).map(x => x+1) is equivalent to most.map(x => x+1, most.of(x)). So, it's possible to write pure functions in different repos that provide new functionality.
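As a sketch of what such an external function might look like, here is collect written as a standalone pure function. It only assumes that its argument exposes reduce(f, initial), as most.js streams do per the examples earlier in this thread. The fakeStream helper is a hypothetical stand-in so the snippet runs without most.js installed:

```javascript
// Hypothetical helper, not part of most.js: works with anything that
// exposes reduce(f, initial).
const collect = stream =>
  stream.reduce((xs, x) => { xs.push(x); return xs; }, []);

// Stand-in for a stream. Its reduce returns a promise, mimicking the
// promise-returning reduce used in the examples above.
const fakeStream = items => ({
  reduce: (f, initial) => Promise.resolve(items.reduce(f, initial))
});

collect(fakeStream(['apple', 'orange', 'banana']))
  .then(console.log); // [ 'apple', 'orange', 'banana' ]
```

Nothing here touches Stream.prototype, so a function like this could be published in its own repo without risk of clashing with other extensions.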

The question in my mind re: most-core is: What is the actual purpose of it? Do we really see people using something like most-core by itself? Is it realistic to expect that someone will build something, other than "most.js proper", on top of it?

Or is the purpose simply to modularize most.js itself to allow consumers to pick, a la carte, groups of functionality that they want/need?

I don't know the answers to those, but would love to hear your thoughts!

jasonkuhrt commented 9 years ago

@briancavalier If you start a Reactive A+ spec :D then I could see a clear justification for most-core being a spec implementation that can be shared by e.g. kefir, rx, bacon, etc. Until such a time, I don't really see the reason, assuming that most is open to adding small but commonly used combinators and is also easily extensible at the project level (I think the prototype technique mostly achieves this, but it does not extend the pure API, does it?).

jasonkuhrt commented 9 years ago

At least one thing I am interested in seeing is a curried pure API along with some composition helpers like pipe and compose that would expedite the pure style. Currying is the reason ramda is way more readable and convenient than something like lodash (unless you use its chaining API to take advantage of JavaScript's "native" composition operator, the dot).
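For illustration, a curried pure API with pipe and compose helpers might look roughly like the following. Everything here (curry, pipe, compose, map, filter) is defined locally as an assumption, with arrays standing in for streams; none of it is existing most.js API:

```javascript
// Illustrative only: arrays stand in for streams, and every function
// below is defined here rather than taken from most.js.

// curry: keep collecting arguments until f's declared arity is met.
const curry = f => {
  const step = args =>
    args.length >= f.length
      ? f(...args)
      : (...more) => step(args.concat(more));
  return (...args) => step(args);
};

// pipe applies functions left to right, compose right to left.
const pipe = (...fs) => x => fs.reduce((acc, f) => f(acc), x);
const compose = (...fs) => x => fs.reduceRight((acc, f) => f(acc), x);

// Data-last, curried combinators.
const map = curry((f, xs) => xs.map(x => f(x)));
const filter = curry((p, xs) => xs.filter(x => p(x)));

const evensTimesTen = pipe(
  filter(x => x % 2 === 0),
  map(x => x * 10)
);

console.log(evensTimesTen([1, 2, 3, 4])); // [ 20, 40 ]
```

Putting the data argument last is what lets partially applied combinators slot directly into pipe without wrapper lambdas.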

briancavalier commented 9 years ago

@jasonkuhrt I'm with you on pure functions + currying. I prefer it over chaining, but I think the JS community at large tends to prefer chaining. BTW, have you seen #126 ?

briancavalier commented 9 years ago

Also see #30, which is another approach to currying that preserves function.name, function arity, and parameter names.

davidchase commented 9 years ago

@briancavalier I like the à la carte approach, similar to what flyd is currently doing by providing a small, readable core with some basic functionality.

Ramda recently had a few PRs to remove some functions that can just be composed from the core functions, e.g. what used to be mapIndex is now map + addIndex. Maybe not the best example, because it's a pretty extensive library, but they are still having a similar discussion.

IMO it provides a small core that is easy to reason about, but it also opens the floor to creativity to see what else we can create. In a recent chat, a developer was intrigued that you can do map = compose(ap, of), or that a simple compose (f, g) => (...args) => f(g(...args)) will only take two functions, but if we do compose = (...args) => reduce(compose, args), in which we reduce over the above compose, we can now take more functions in the pipeline...
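A small runnable version of both observations, assuming curried array-based of and ap as stand-ins (these are illustrative definitions, not taken from most.js or Ramda):

```javascript
// Binary compose, as written above.
const compose2 = (f, g) => (...args) => f(g(...args));

// Variadic compose: reduce the binary one over the argument list.
// Requires at least one function, since reduce has no initial value.
const compose = (...fs) => fs.reduce(compose2);

const inc = x => x + 1;
const double = x => x * 2;
const add = (a, b) => a + b;

console.log(compose(inc, double, add)(1, 2)); // inc(double(add(1, 2))) = 7

// map = compose(ap, of), with arrays as the applicative.
const of = x => [x];
const ap = fs => xs => fs.flatMap(f => xs.map(x => f(x)));
const map = compose2(ap, of);

console.log(map(inc)([1, 2, 3])); // [ 2, 3, 4 ]
```

Note that the identity only works directly when ap is curried, so that ap(of(f)) returns a function still waiting for the data.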

thoughts?

jasonkuhrt commented 9 years ago

Ah thanks for the issue links. I can follow up in those threads then at some point.

the JS community at large tends to prefer chaining

Yeah. I think the happy medium is to support both. It's a bit harder for ramda, since that would mean extending JavaScript natives (or requiring the strange R([1,2,3]).do().stuff() idiom), but for libraries introducing their own data structures, such as most or https://github.com/facebook/immutable-js/, that "native" problem does not exist and so both APIs can be supported happily.

Aside: I have a brewing interest in creating a foundational prelude based on combining code/ideas from immutable (for data structures), flow (for types), folklore (core abstractions like Maybe, Either), most (async), and purry (currying / partial application), but at some point it raises the question of why I don't just go write in PureScript or GHCJS, ha.

jasonkuhrt commented 9 years ago

@davidchase My fear is that the modularity you refer to risks the "elegant implementation" problem (taking the core and extending it in elegant compositional ways). I am no perf expert, so @briancavalier can speak authoritatively on this, but I feel like a super-high-performance FRP is really, really important to JS, because we need something that can be legitimately used for e.g. creating TCP servers, not just UI, which I think is hardly the monopoly of FRP.

I'm all for modularization, but I think it's a high tax to pay that requires a clear plan beforehand. Will there be a spec? Will there be perf guarantees / guides? Will the core be typed (e.g. https://github.com/facebook/flow)? etc.

davidchase commented 9 years ago

I would like to get @briancavalier's opinion on the matter rather than have it turn into a battle of A vs. B, with you giving me examples of "fat projects" and me sending back modular ones that accomplish the same thing.

briancavalier commented 9 years ago

I'm all for modularization, but I think it's a high tax to pay that requires a clear plan beforehand

Personally, I'm highly in favor of modular design in general. I'm not a fan of monolithic libs, even though I've written my share of them!

In my experience, finding the right level of modularity is important, i.e. how much functionality is enough?

The other thing is finding the right way to export an API that is easy to work with while also flexible and performant enough to enable other folks to build awesome stuff with it. At the same time, you have to try to make sure the API isn't easy to make mistakes with (for example, by exposing too much of the internals), or too clumsy to work with. I'm wrestling with this on another project currently.

You're right @jasonkuhrt, getting that API right takes a clear plan, which, in turn, usually requires a lot of thinking and discussion ... or a eureka moment :)

I think most.js is actually in a pretty good spot with its Source/Sink APIs--it just needs docs. I should really do this :)

The question in my mind is still: What is the right set of base functionality that belongs in a most-core that would make it useful on its own and as a solid cornerstone for folks who want to write new combinators?

As for functions vs. chaining, my personal preference is functions. IMHO, chaining reads nicely, but makes extensibility harder because you basically have to rely on inheritance (if the lib is coded in a way that supports it, which most.js currently is not) instead of composition, or you modify prototypes directly and hope that no one else is doing the same thing. Most.js's function API is already curry-friendly ... i.e. you could curry it with your favorite currying function if you want. But, it's probably nicer if we pre-curry it for you?

Perhaps a core set of methods for chaining (the required fantasy-land monoid, functor, applicative, monad methods?), plus pre-curried functions for everything else?

cujojs / most

Add new consumer stream "collect" #156