typelevel / cats

Lightweight, modular, and extensible library for functional programming.
https://typelevel.org/cats/
Other
5.25k stars 1.21k forks source link

Reducer typeclass #1360

Open benhutchison opened 8 years ago

benhutchison commented 8 years ago

Reducers model append and prepend operations on Monoidal containers.

Described on slide 9 of this talk by Kmett

See also haskell lib: https://hackage.haskell.org/package/monoids-0.3.2/docs/Data-Monoid-Reducer.html#t:Reducer

johnynek commented 8 years ago

I think Aggregator is a better type:

https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/Aggregator.scala#L325

which is also an Applicative on the result type.

Algebird uses Aggregator and MonoidAggregator pretty heavily.

benhutchison commented 8 years ago

Aggregator seems related but not comparable to Reducer.

As explained in the slides, the key use for Reducer is efficient prepend/append of an element to a container.

That doesn't look like the intent of Aggregator, which with 3 types and 3 interlinked operations, has a more complex structure (..that may also be useful, I just don't know).

johnynek commented 8 years ago

Aggregator has append for that purpose (same as cons in Reducer), but it lacks snoc. It is the same as this type: https://hackage.haskell.org/package/folds-0.6.2/docs/Data-Fold-M.html

Reducer is an Aggregator with the final function (present it is called) being identity. Put another way, Aggregator (or Data.Fold.M) is Reducer combined with a free Functor.

Some interesting examples of Aggregators:

average of numbers: https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/AveragedValue.scala#L73

approximate unique counts with hyperloglog: https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/Aggregator.scala#L271

approximate percentile: https://github.com/twitter/algebird/blob/develop/algebird-core/src/main/scala/com/twitter/algebird/Aggregator.scala#L278

none of those work well without the final function to map from the monoid structure onto the value you want.

ghost commented 7 years ago

:+1: That Aggregator is a useful type and a tentative :+1: that it should ideally live separate to cats(-core).

Question is then, does a (maintained) cats version already exists somewhere and if not, where should this and potentially other related types live?

benhutchison commented 7 years ago

@etorreborre has https://github.com/atnos-org/origami built atop cats

Fold is similar to Aggregator in some ways, but allows monadic effects while folding so its fold operation (S, A) => M[S] (folding down As) is non-parallelizable unlike the Semigroup in Aggregator.

etorreborre commented 7 years ago

By the way the origami library is more or less doing what the foldl library is doing in haskell.