twitter / algebird

Abstract Algebra for Scala
https://twitter.github.io/algebird
Apache License 2.0
2.29k stars 345 forks source link

Add conservative update to CMS #460

Open johnynek opened 9 years ago

johnynek commented 9 years ago

See: http://dl.acm.org/citation.cfm?id=1868768

If you estimate the new size, we certainly don't need to update any buckets to a value larger than that. This appears to be non-associative, but could be useful as a method like:

def conservative_+(kv: (K, Long)): CMS[K]

also, we could consider a variant that does this by default when merging sparse values.

sritchie commented 8 years ago

This paper describes the conservative update as well: http://dimacs.rutgers.edu/~graham/pubs/papers/cmencyc.pdf

We'll want to note the caveat at the end:

image