johnmyleswhite / StreamStats.jl

Compute statistics over data streams in pure Julia

extend StreamStats to generic types #15

Open gasagna opened 9 years ago

gasagna commented 9 years ago

Hi, first of all thanks for this useful package.

I was wondering whether it would be possible to extend the package to handle generic types, instead of being based on Float64 only.

I have a large number of snapshots from a simulation that do not all fit in memory, and I need to compute statistics over them. The snapshots are represented by a custom type, and I already have in place all the machinery for performing basic operations on them, such as +, -, *, / and others.

I guess it would not be too big a change, as the required functionality would be supplied by the user code.

Davide
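
To make the request above concrete, here is a minimal sketch of a running mean that works for any type supporting `+`, `-` and division by a number. It assumes a recent Julia (1.x); `GenericMean`, `update!` and the toy `Snapshot` type are hypothetical names for illustration, not part of StreamStats.jl.

```julia
# Hypothetical sketch, not StreamStats.jl's API: a running mean over any
# type T that supports `+`, `-` and division by a real number.
mutable struct GenericMean{T}
    m::T      # current mean
    n::Int    # number of observations seen so far
end

# Start from the first observation so no "zero" element of T is required.
GenericMean(x::T) where {T} = GenericMean{T}(x, 1)

function update!(stat::GenericMean{T}, x::T) where {T}
    stat.n += 1
    stat.m = stat.m + (x - stat.m) / stat.n   # incremental mean update
    return stat
end

# A toy stand-in for a user-defined snapshot type.
struct Snapshot
    u::Float64
    v::Float64
end
Base.:+(a::Snapshot, b::Snapshot) = Snapshot(a.u + b.u, a.v + b.v)
Base.:-(a::Snapshot, b::Snapshot) = Snapshot(a.u - b.u, a.v - b.v)
Base.:/(a::Snapshot, c::Real) = Snapshot(a.u / c, a.v / c)

stat = GenericMean(Snapshot(1.0, 10.0))
update!(stat, Snapshot(3.0, 30.0))
```

The user code only has to supply the arithmetic; the statistic itself never mentions Float64.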

gasagna commented 9 years ago

However, a naive generalization might end up running inefficient code.

Take, for example, the update of the mean:

stat.m = (1 - α) * stat.m + α * x

If stat.m is a large array, the above operation will incur significant and unnecessary memory allocation, because each intermediate product and the final sum create a new temporary array.
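
One possible way around the allocation, sketched under the assumption of a recent Julia (1.x) with fused dot broadcasting, is to overwrite the stored mean in place. `ArrayEWMean` and `update!` are hypothetical names, not existing StreamStats.jl code.

```julia
# Hypothetical sketch: exponentially weighted mean for array-valued data,
# updated in place so that no temporary arrays are created.
struct ArrayEWMean
    m::Vector{Float64}   # running mean, overwritten on every update
    α::Float64           # smoothing weight
end

function update!(stat::ArrayEWMean, x::AbstractVector{Float64})
    α = stat.α
    # Fused broadcast: a single loop that writes directly into stat.m.
    @. stat.m = (1 - α) * stat.m + α * x
    return stat
end

stat = ArrayEWMean(zeros(3), 0.5)
update!(stat, [2.0, 4.0, 6.0])
```

The struct itself can stay immutable here, since only the contents of `stat.m` change, not the field binding.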

johnmyleswhite commented 9 years ago

There's some infrastructure in this package's type hierarchy to do that already: it just needs fleshing out.

We definitely need to use specialized code for every concrete type, so we can't just generalize things and expect to maintain reasonable performance. We'll need to define specific MultivariateStreamStat objects, although we could possibly still use things like Mean() to construct those objects.
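
As a rough illustration of that idea, the sketch below uses a single constructor-like entry point that dispatches to a specialized concrete type, each with its own update method. It assumes a recent Julia (1.x); every name in it (`AbstractStat`, `ScalarMean`, `VectorMean`, `make_mean`, `update!`) is hypothetical and does not reflect the package's actual type hierarchy.

```julia
# Hypothetical names throughout; not StreamStats.jl's actual type hierarchy.
abstract type AbstractStat end

mutable struct ScalarMean <: AbstractStat
    m::Float64
    n::Int
end

mutable struct VectorMean <: AbstractStat
    m::Vector{Float64}
    n::Int
end

# One generic-looking entry point; dispatch picks the specialized type.
make_mean(::Type{Float64}) = ScalarMean(0.0, 0)
make_mean(::Type{Vector{Float64}}, dim::Integer) = VectorMean(zeros(dim), 0)

function update!(stat::ScalarMean, x::Float64)
    stat.n += 1
    stat.m += (x - stat.m) / stat.n
    return stat
end

function update!(stat::VectorMean, x::Vector{Float64})
    stat.n += 1
    n = stat.n
    @. stat.m += (x - stat.m) / n   # in place, no temporary arrays
    return stat
end

s = make_mean(Float64)
update!(s, 1.0)
update!(s, 3.0)

v = make_mean(Vector{Float64}, 2)
update!(v, [1.0, 2.0])
update!(v, [3.0, 6.0])
```

Each concrete type keeps its own tuned update code, while callers see a uniform way of constructing and updating statistics.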