Closed rebcabin closed 12 years ago
Can you explain a little more about how you're not looking for the equivalent to
_.flatMap = _.compose(_.flatten, _.map)
This is just Haskell's concatMap
, which is just (concat . map)
.
The issue I found is that underscore's flatten
flattens all levels of its input, whereas Haskell's concatMap
flattens exactly one level of its input. This difference is critical, because it means that Haskell's concatMap
is equivalent to a monadic bind, in the sense that concatMap someFunc someList
equals someList >>= someFunc
. This may seem like a nit or a feckless generalization, but without a proper flatMap
(or concatMap
, if you prefer), we have a non-monadic library of list operators and miss the mark of full compositionality. In particular, without it, we must leave the library to get the "list of all line items from customers" from composing an operation that gets "the list of all orders from customers" and an operation that gets "the list of all line items from orders." With a proper monadic flatMap
(or concatMap
), we just say
customers.flatMap(customer -> customer.Orders).flatMap(order -> order.lineItems)
If flatMap
flattens too much, though, we can't get a list of orders out of it -- we've gone too far with the first flatMap(customer->customer.Orders)
because it too aggressively flattens the substructure of lineItems
.
A little monadic discipline goes a long way. Having a proper flatMap
that flattens exactly one level makes the underscore library equivalent to Haskell's list monad or to .NET's LINQ over lists. It means the underscore library as a design is, in principle, extendable to other monads, most importantly to asynchronous streams of data. This ups its ante tremendously, in my opinion. Imagine a library with exactly the same operators as underscore's, but just works over asynchronous streams. This would be equivalent to .NET's reactive framework (detail: the reactive equivalent to a list is an observable
that exposes a subscribe
method, and observers
subscribe their onNext
, onError
, and onCompletion
callbacks to observables. The monadic library will allow us to filter, transform, expand, etc. the observables.)
(my remarks above are predicated on modeling the data as nested arrays -- as I noted before, if the data is modeled as alternating arrays and objects, then the composition of underscores flatten
with map
accidentally works because flatten
won't cut through the next lower object layer -- the data model already presumes a list with one level of structure accessible to flatten
. I reckon this to be a lucky accident that pertains only to one conventional way of modeling data.)
The issue I found is that underscore's
flatten
flattens all levels of its input
Ah, I wasn't aware of that. Now I understand.
I don't like that _.flatten
behaviour. I propose that _.flatten
is made shallow by default so that _.flatMap
can be implemented as I listed above. I know it will be a breaking change, but it was a mistake to implement it that way. For now, _.flatMap
can be implemented as:
_.flattenOnce = function(arr){
return _.reduce(arr, function(memo, b){ return memo.concat(b); }, []);
};
_.compose(_.flattenOnce, _.map);
There is a secret "second" argument of the original flatten
that I exploited in my implementation of flatMap.
I discovered it by looking at underscore's source. Take a peek at my fork if you have a sec :)
https://github.com/rebcabin/underscore/commit/4f90bab75d8f69df8d8d0317354f9f5981e2fedb
Yep, I saw that when I looked at the source of _.flatten
after you mentioned it performed a deep flatten. That's why I suggested that we make _.flatten
shallow by default. I see that it has the parameter, but it defaults to deep flatten behaviour. I'd also be okay with defining an _.concat
that behaves as _.flatten
except with that second parameter defaulting to true
.
Also note that Mathematica has a hyper-generic flatten
function that permits full control over flattening at all levels. I do a lot of Mathematica work, and I often use the equivalents of flattenOnce
and flattenAll
. But it's occasionally nice to be able to reach in and manipulate internal levels. Probably too much for a utility-belt like underscore, but I do think that both flattenOnce
and flattenAll
are really useful.
Ah, ok, I get you Michael. I would concur on changing the default of flatten
if the breaking change isn't too heavy on users. That's a judgment call I can't make :)
I'm afraid that the breaking change would be too heavy -- Underscore is a fairly pervasive library, and I'd like to put a high premium on backwards compatibility.
... but how about simply documenting the second argument to flatten
? Isn't that what we're looking for here?
to correspond to Haskell's "bind" and LINQ's "SelectMany." This is a fundamental operator that, in one call, applies a transformation to each element of its input collection -- a transformation that produces a collection of a second kind; and then flattens the result exactly one level. Without it, there is no primitive way to expand a collection ("map" lets you transform a collection, "filter" lets you reduce a collection, and "fold" lets you burn down a collection to a result; but "flatmap" is the fundamental one that lets you build up a collection from elementary inputs. It's fundamental in the monadic sense: it makes your collections into a proper monad with all the pursuant composability benefits.
Canonical example, from a list of customers, each of which has 0 or more orders, each of which has zero or more line-items, extract the list of all orders, preserving the underlying line-item structure but giving up the customer info. Compositionally (in another 'dotted' operation), product the list of all line-items
In a programming pattern of alternating arrays and objects, "map" and "flatten" will (somewhat accidentally) get the job done because "flatten" doesn't flatten objects, only arrays. However, if the data were modeled with pure arrays, I see no good way to do this (without the secret, second argument of "flatten") because the first flatten would blow out all substructure.
I implemented a proposal for flatmap in a fork of underscore's source, along with a few unit tests for your consideration. I don't have docco running and I didn't produce a "min" version of the javascript, so it's just the javascript and the (fairly uncomprehensive) unit tests.
https://github.com/rebcabin/underscore/commit/4f90bab75d8f69df8d8d0317354f9f5981e2fedb