srdja / Collections-C

A library of generic data structures for the C language.
http://srdja.github.io/Collections-C
GNU Lesser General Public License v3.0
2.82k stars, 328 forks

How about adding Bloom Filter data structure? #162

Open. Opened by ericbreyer 1 year ago

ericbreyer commented 1 year ago

I think Bloom filters are pretty cool. They are a fast, space-efficient probabilistic implementation of a set, with the trade-off that "False positive matches are possible, but false negatives are not – in other words, a query returns either 'possibly in set' or 'definitely not in set'". One interesting application is guarding an expensive search: if the filter says the element is definitely not present, the search can be skipped. They also fit cases where you only care whether an element is outside the set.
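To make the idea concrete, here is a minimal sketch of a plain Bloom filter in C. Everything in it is an assumption for illustration: the `BloomFilter` type, the `bloom_*` function names, the seeded FNV-1a hash, and the double-hashing scheme are hypothetical and not part of Collections-C or of the implementation linked in this issue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type for illustration; not a Collections-C API. */
typedef struct {
    uint8_t *bits;       /* bit array holding num_bits bits */
    size_t   num_bits;
    size_t   num_hashes; /* number of bit positions set per key */
} BloomFilter;

/* 64-bit FNV-1a with a seed mixed into the offset basis. */
static uint64_t bloom_hash(const void *data, size_t len, uint64_t seed)
{
    const uint8_t *p = data;
    uint64_t h = 14695981039346656037ULL ^ seed;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Double hashing: the i-th position is (h1 + i * h2) mod num_bits. */
static size_t bloom_index(const BloomFilter *bf, const void *key,
                          size_t len, size_t i)
{
    uint64_t h1 = bloom_hash(key, len, 0);
    uint64_t h2 = bloom_hash(key, len, 0x9e3779b97f4a7c15ULL) | 1;
    return (size_t)((h1 + (uint64_t)i * h2) % bf->num_bits);
}

BloomFilter *bloom_new(size_t num_bits, size_t num_hashes)
{
    assert(num_bits > 0 && num_hashes > 0);
    BloomFilter *bf = malloc(sizeof *bf);
    if (!bf)
        return NULL;
    bf->bits = calloc((num_bits + 7) / 8, 1); /* all bits start at 0 */
    if (!bf->bits) {
        free(bf);
        return NULL;
    }
    bf->num_bits   = num_bits;
    bf->num_hashes = num_hashes;
    return bf;
}

void bloom_free(BloomFilter *bf)
{
    if (bf) {
        free(bf->bits);
        free(bf);
    }
}

/* Set every bit position derived from the key. */
void bloom_add(BloomFilter *bf, const void *key, size_t len)
{
    for (size_t i = 0; i < bf->num_hashes; i++) {
        size_t idx = bloom_index(bf, key, len, i);
        bf->bits[idx / 8] |= (uint8_t)(1u << (idx % 8));
    }
}

/* false: definitely not in the set. true: possibly in the set. */
bool bloom_maybe_contains(const BloomFilter *bf, const void *key, size_t len)
{
    for (size_t i = 0; i < bf->num_hashes; i++) {
        size_t idx = bloom_index(bf, key, len, i);
        if (!(bf->bits[idx / 8] & (1u << (idx % 8))))
            return false;
    }
    return true;
}
```

In the "guard an expensive search" use case, a caller would run the real lookup only when `bloom_maybe_contains` returns `true`; a `false` result means the key was never added, so the lookup can be skipped.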

I have a very rudimentary implementation of a (counting) Bloom filter in C++ here (porting it to C should be trivial): https://github.com/ericbreyer/redBlackTreeInCpp/blob/master/bloomFilter/bloomFilter.cpp

Wikipedia on Bloom filters: https://en.wikipedia.org/wiki/Bloom_filter

Wikipedia on counting Bloom filters: https://en.wikipedia.org/wiki/Counting_Bloom_filter
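For readers unfamiliar with the counting variant, the sketch below shows the general idea in C: each slot holds a small counter instead of a single bit, which makes removal possible. This is not a port of the C++ code linked above. The `CountingBloom` type, the `cb_*` names, the 8-bit saturating counters, and the hash function are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type for illustration; not a Collections-C API. */
typedef struct {
    uint8_t *counters;   /* one 8-bit counter per slot */
    size_t   num_slots;
    size_t   num_hashes;
} CountingBloom;

/* 64-bit FNV-1a with a seed mixed into the offset basis. */
static uint64_t cb_hash(const void *data, size_t len, uint64_t seed)
{
    const uint8_t *p = data;
    uint64_t h = 14695981039346656037ULL ^ seed;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Double hashing: the i-th slot is (h1 + i * h2) mod num_slots. */
static size_t cb_slot(const CountingBloom *cb, const void *key,
                      size_t len, size_t i)
{
    uint64_t h1 = cb_hash(key, len, 0);
    uint64_t h2 = cb_hash(key, len, 0x9e3779b97f4a7c15ULL) | 1;
    return (size_t)((h1 + (uint64_t)i * h2) % cb->num_slots);
}

CountingBloom *cb_new(size_t num_slots, size_t num_hashes)
{
    assert(num_slots > 0 && num_hashes > 0);
    CountingBloom *cb = malloc(sizeof *cb);
    if (!cb)
        return NULL;
    cb->counters = calloc(num_slots, 1); /* all counters start at 0 */
    if (!cb->counters) {
        free(cb);
        return NULL;
    }
    cb->num_slots  = num_slots;
    cb->num_hashes = num_hashes;
    return cb;
}

void cb_free(CountingBloom *cb)
{
    if (cb) {
        free(cb->counters);
        free(cb);
    }
}

/* Increment each slot for the key; a counter at UINT8_MAX stays saturated. */
void cb_add(CountingBloom *cb, const void *key, size_t len)
{
    for (size_t i = 0; i < cb->num_hashes; i++) {
        size_t s = cb_slot(cb, key, len, i);
        if (cb->counters[s] < UINT8_MAX)
            cb->counters[s]++;
    }
}

/* false: definitely not in the set. true: possibly in the set. */
bool cb_maybe_contains(const CountingBloom *cb, const void *key, size_t len)
{
    for (size_t i = 0; i < cb->num_hashes; i++) {
        if (cb->counters[cb_slot(cb, key, len, i)] == 0)
            return false;
    }
    return true;
}

/* Decrement each slot for the key. Returns false if the key is definitely
 * absent. Saturated counters are left alone because their true count is
 * unknown, and zero counters are never decremented. */
bool cb_remove(CountingBloom *cb, const void *key, size_t len)
{
    if (!cb_maybe_contains(cb, key, len))
        return false;
    for (size_t i = 0; i < cb->num_hashes; i++) {
        size_t s = cb_slot(cb, key, len, i);
        if (cb->counters[s] > 0 && cb->counters[s] < UINT8_MAX)
            cb->counters[s]--;
    }
    return true;
}
```

One caveat with this design: removing a key that was never added but happens to be a false positive will decrement counters that belong to other keys, which can introduce false negatives. Callers should only remove keys they know were added.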

Also, skip lists could be cool?

srdja commented 1 year ago

@ericbreyer Would be cool for sure! I certainly thought about implementing it (and many others too), but never really got around to actually doing it. But if you feel like doing it, that would be really cool!

ericbreyer commented 1 year ago

Cool, I will work on that! Is there any guide or docs for how the testing should be structured?