welch / tdigest

tdigest: JavaScript implementation of Dunning's T-Digest for streaming quantile approximation
MIT License

[Suggestion] Small exports for distributed programs #3

Open SGrondin opened 8 years ago

SGrondin commented 8 years ago

Hi,

This is a fork used in large distributed programs where I work. It adds a Distributable class that inherits from Digest. The purpose of that class is to minimize the size of the exported state (toArray()), so that a node wanting to read a percentile value can fetch a small internal state from each of the other nodes and recompute the percentile quickly.

It implements toList(), which is a more compact version of toArray(). It represents each centroid as a two-element array instead of an object, which saves the space taken up by the countless repeated mean: ..., n: ... keys. The centroids can be pushed back into a new Distributable instance using .push(centroid[0], centroid[1]).
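To illustrate the idea, here is a minimal sketch, not the fork's actual code. It assumes centroids arrive as `{mean, n}` objects (the shape the key names above suggest) and that the receiving object exposes `push(mean, n)`. The helpers `toList` and `pushAll` and the stand-in `makeCollector` are hypothetical names used only for this example, so it runs without the library.

```javascript
// Hypothetical helper: convert {mean, n} centroid objects into compact
// [mean, n] pairs, dropping the repeated key names.
function toList(centroids) {
  return centroids.map(function (c) {
    return [c.mean, c.n];
  });
}

// Hypothetical helper: feed compact lists fetched from several nodes into
// any object exposing push(mean, n).
function pushAll(target, lists) {
  lists.forEach(function (list) {
    list.forEach(function (centroid) {
      target.push(centroid[0], centroid[1]);
    });
  });
  return target;
}

// Stand-in for a digest (an assumption, so the sketch is self-contained).
// It only tracks the total weight and the weighted sum of the means.
function makeCollector() {
  return {
    total: 0,
    weightedSum: 0,
    push: function (mean, n) {
      this.total += n;
      this.weightedSum += mean * n;
    }
  };
}

// Exported state of two imaginary nodes.
var objectsA = [{ mean: 1.5, n: 2 }, { mean: 4, n: 1 }];
var objectsB = [{ mean: 10, n: 3 }];

var nodeA = toList(objectsA);
var nodeB = toList(objectsB);

// The reading node merges every list into one fresh instance.
var merged = pushAll(makeCollector(), [nodeA, nodeB]);

// Compare the serialized sizes of the two representations.
var objectBytes = JSON.stringify(objectsA).length;
var listBytes = JSON.stringify(nodeA).length;
```

In a real setup the collector would be replaced by a digest instance, and each node would serialize its list (for example as JSON) before sending it over the network. The saving grows with the number of centroids, since each one no longer carries its own key names.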

I have no idea if this would be useful to you or anyone else, but I'm opening this PR in case you find it interesting and/or want to merge it.

welch commented 8 years ago

Thanks! I'll give it a look later today.

SGrondin commented 8 years ago

To be honest, I think it's a very narrow use case, the settings are hardcoded, and there are no tests. I would be surprised if you merged it as is. I opened the PR because when I do work on top of open source software, I like to show the author how it's being used in case it gives them ideas :)

welch commented 8 years ago

As you say, it's not mergeable code (I'd hit you up for unit tests at least). But it's a pretty classy way to submit a feature request :smile:

I'll take this on so I can get you back to running the main line.

Thanks!