twitter / algebird

Abstract Algebra for Scala
https://twitter.github.io/algebird
Apache License 2.0
2.29k stars 346 forks source link

implement CuckooFilter (related to BloomFilter) #560

Open sritchie opened 7 years ago

sritchie commented 7 years ago

From this excellent blog post:

If you want to avoid a technical discussion, all you need to know is that for reasonably large sized sets, for the same false positive rate as a corresponding Bloom filter, cuckoo filters use less space than Bloom filters, are faster on lookups (but slower on insertions/to construct), and amazingly also allow deletions of keys (which Bloom filters cannot do).

(Not sure what the etiquette is here if another scala implementation exists. We could try and solicit a PR?)

johnynek commented 7 years ago

Looking at the other implementation, it seems good, but it as is would not mesh well with algebird. It uses a number of special AnyVal classes, which seem fine, but since we have not used that style elsewhere it would be rather inconsistent.

Since it exists, I might just say people should use that if they want. We could send a PR for a module to that library to add an Algebra monoid, which we will be able to use.

If we want a "batteries included approach" here we should probably reimplement. On Tue, Oct 25, 2016 at 06:03 Sam Ritchie notifications@github.com wrote:

From this excellent blog post http://mybiasedcoin.blogspot.com/2014/10/cuckoo-filters.html:

If you want to avoid a technical discussion, all you need to know is that for reasonably large sized sets, for the same false positive rate as a corresponding Bloom filter, cuckoo filters use less space than Bloom filters, are faster on lookups (but slower on insertions/to construct), and amazingly also allow deletions of keys (which Bloom filters cannot do).

The paper: http://www.eecs.harvard.edu/~michaelm/postscripts/cuckoo-conext2014.pdf C++ impl from the author: https://github.com/efficient/cuckoofilter Dense Cuckoo filter in Scala: https://github.com/newhoggy/pico-cuckoo-filter

(Not sure what the etiquette is here if another scala implementation exists. We could try and solicit a PR?)

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/twitter/algebird/issues/560, or mute the thread https://github.com/notifications/unsubscribe-auth/AAEJdkwreKbCtR8OevsVFbuMBEo2xjyBks5q3igsgaJpZM4KgLBO .

sritchie commented 7 years ago

+1 on the batteries included approach, especially as we embark on the serialization journey.

vertexclique commented 7 years ago

Any updates on this ?

sritchie commented 7 years ago

Hey Mahmut,

This is currently a placeholder issue for anyone in the community that wants to try their hand at the implementation. I don't think anyone has claimed it yet.

On Monday, February 27, 2017, Mahmut Bulut notifications@github.com wrote:

Any updates on this ?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/twitter/algebird/issues/560#issuecomment-282678058, or mute the thread https://github.com/notifications/unsubscribe-auth/AAEQA0OXEsTb1loAFA3bX55pO38DpGWGks5rgqACgaJpZM4KgLBO .

-- Sam Ritchie, Stripe Inc.

(Too brief? Here's why! http://emailcharter.org)

sritchie commented 7 years ago

Peter, that sounds fantastic! Looking forward to seeing what you come up with.

On Sat, Jul 1, 2017 at 12:36 AM Peter Wang notifications@github.com wrote:

I'm happy to look into this and take this on @sritchie https://github.com/sritchie

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/twitter/algebird/issues/560#issuecomment-312414557, or mute the thread https://github.com/notifications/unsubscribe-auth/AAEQAxnwhYVpxFt8bqmi66NcfXgnRUbyks5sJejogaJpZM4KgLBO .

cesarcolle commented 6 years ago

I figure no update on this issue has been published yet. I have started the development of the cuckoo filter in a monoid way, I base my development on the BloomFilter (I am a beginner with Algebird) .

https://github.com/cesarcolle/algebird/blob/feature-cuckoo-filter/algebird-core/src/main/scala/com/twitter/algebird/CuckooFilter.scala

https://github.com/cesarcolle/algebird/blob/feature-cuckoo-filter/algebird-test/src/test/scala/com/twitter/algebird/CuckooFilterTest.scala