haskell / statistics

A fast, high quality library for computing with statistics in Haskell.
http://hackage.haskell.org/package/statistics
BSD 2-Clause "Simplified" License

Lens based API for statistics calculation (mean, stddev, etc) #162

Open Shimuuar opened 4 years ago

Shimuuar commented 4 years ago

This is one way to resolve #146. All examples here use mean, but the approach readily generalizes to all fold-based estimators and, with some problems, to quantiles and the like, which require building an intermediate vector.

The current implementation of mean has the following signature: mean :: (G.Vector v Double) => v Double -> Double, and as such it can only work with vectors, which is very inconvenient. The proposal is to take a Getting for the data structure as a parameter:

meanOf :: Getting (Endo (Endo MeanKBN)) s Double -> s -> Double

This is a surprisingly powerful API. Here are some examples:

  1. meanOf each computes the mean for any instance of the Each type class, so it works for vectors, lists, Maps, and a lot of other data structures
  2. meanOf folded works for any Foldable
  3. meanOf (each . each) computes the mean of nested containers
  4. meanOf (each . filtered (>0) . to log) computes the mean of the logarithm of every positive number
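To make the examples above concrete, here is a minimal sketch of how such a meanOf could be written on top of lens's foldlOf', whose first argument has exactly the Getting (Endo (Endo r)) s a shape. The MeanAcc type is a hypothetical stand-in for MeanKBN: it keeps a plain count and running sum with no compensated summation, and returning NaN for empty input is likewise an assumption rather than the proposed behaviour.

```haskell
import Control.Lens (Getting, each, filtered, folded, foldlOf', to)
import Data.Monoid (Endo)
import qualified Data.Map as Map

-- Hypothetical stand-in for MeanKBN: element count and plain running sum.
data MeanAcc = MeanAcc !Int !Double

-- Strict left fold over every Double the optic focuses on.
meanOf :: Getting (Endo (Endo MeanAcc)) s Double -> s -> Double
meanOf l = finish . foldlOf' l step (MeanAcc 0 0)
  where
    step (MeanAcc n s) x = MeanAcc (n + 1) (s + x)
    finish (MeanAcc 0 _) = 0 / 0 -- NaN for empty input (an assumption)
    finish (MeanAcc n s) = s / fromIntegral n

main :: IO ()
main = do
  -- Any instance of Each, here a list
  print (meanOf each [1, 2, 3 :: Double])
  -- Any Foldable, here the values of a Map
  print (meanOf folded (Map.fromList [(1 :: Int, 2.0), (2, 4.0 :: Double)]))
  -- Nested containers
  print (meanOf (each . each) [[1, 2], [3, 4 :: Double]])
  -- Mean of the logarithm of every positive element
  print (meanOf (each . filtered (> 0) . to log) [-1, 1, exp 2 :: Double])
```

The same optic argument is what lets one function cover all four cases: the caller decides how to traverse the structure, and the estimator only sees a stream of Doubles.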

So this is a very powerful and generic API. Preliminary benchmarks indicate that it's possible to get performance identical to the current implementation. The only downside is that it requires the use of lens, but I think the gained power is worth it.
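For estimators that are not simple folds, such as quantiles, the intermediate vector mentioned at the top might look roughly like the sketch below. The name medianOf and its behaviour are purely illustrative and not part of the proposal; it collects the focused values with toListOf, sorts them, and indexes into an unboxed vector.

```haskell
import Control.Lens (Getting, each, toListOf)
import Data.List (sort)
import Data.Monoid (Endo)
import qualified Data.Vector.Unboxed as VU

-- Hypothetical median built on an intermediate sorted vector.
medianOf :: Getting (Endo [Double]) s Double -> s -> Double
medianOf l s
  | n == 0    = 0 / 0 -- NaN for empty input (an assumption)
  | even n    = (v VU.! (h - 1) + v VU.! h) / 2
  | otherwise = v VU.! h
  where
    v = VU.fromList (sort (toListOf l s)) -- the intermediate vector
    n = VU.length v
    h = n `div` 2

main :: IO ()
main = do
  print (medianOf each [3, 1, 2 :: Double])
  print (medianOf each [4, 1, 3, 2 :: Double])
```

Note that the Getting here is specialised to Endo [Double] rather than Endo (Endo r), which is one of the ways the quantile case differs from the fold-based one.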