addthis / stream-lib

Stream summarizer and cardinality estimator.
Apache License 2.0
2.26k stars 558 forks source link

Implement LogLog-Beta/β (new cardinality estimation algo) #125

Open otisg opened 7 years ago

otisg commented 7 years ago

A new paper released this month introduces a new cardinality estimation algorithm called LogLog-Beta/β:

https://arxiv.org/abs/1612.02284

"The new algorithm uses only one formula and needs no additional bias corrections for the entire range of cardinalities, therefore, it is more efficient and simpler to implement. Our simulations show that the accuracy provided by the new algorithm is as good as or better than the accuracy provided by either of HyperLogLog or HyperLogLog++." Some comments about its accuracy (graphs included) can be found in this PR: https://github.com/antirez/redis/pull/3677

abramsm commented 7 years ago

Excellent. Thanks for sharing. I'd like to see this implemented in stream-lib and will look to do so in the next few months unless someone beats me to it.

On Tue, Jan 24, 2017 at 9:16 PM, Otis Gospodnetic notifications@github.com wrote:

A new paper released this month introduces a new cardinality estimation algorithm called LogLog-Beta/β:

https://arxiv.org/abs/1612.02284

"The new algorithm uses only one formula and needs no additional bias corrections for the entire range of cardinalities, therefore, it is more efficient and simpler to implement. Our simulations show that the accuracy provided by the new algorithm is as good as or better than the accuracy provided by either of HyperLogLog or HyperLogLog++." Some comments about its accuracy (graphs included) can be found in this PR.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/addthis/stream-lib/issues/125, or mute the thread https://github.com/notifications/unsubscribe-auth/AABfayyFtWNPBRG_Lk9yBEpaRLz4OkOcks5rVsyZgaJpZM4LtFMQ .