efficient / cuckoofilter


adding string to cuckoo filter #39

Open Uroojt opened 4 years ago

Uroojt commented 4 years ago

I'm trying to use a string as the item type with the cuckoofilter library https://github.com/efficient/cuckoofilter/blob/master/src/cuckoofilter.h

My code:

cuckoofilter::CuckooFilter<string, 12> cuckoo_mempool(total_items);

but every time I compile the code, an error appears at line 68 [https://github.com/efficient/cuckoofilter/blob/master/src/cuckoofilter.h#L68]

error: no match for call to ‘(const cuckoofilter::TwoIndependentMultiplyShift) (const std::__cxx11::basic_string&)’
   68 |   const uint64_t hash = hasher(item);
      |                         ^~~~

siara-cc commented 1 year ago

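I faced the same issue. According to the implementation, it looks like the hasher named in the error, TwoIndependentMultiplyShift, only accepts uint64_t, so the first template parameter has to be an integer type. I hash the string first and add the resulting integer with the following code: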

std::string mystring = "Hello World";
cuckoofilter::CuckooFilter<size_t, 12> test(1000);
test.Add(CityHash64(mystring.c_str(), mystring.length()));

CityHash64 can be found in src/City.h in https://github.com/aappleby/smhasher
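
If pulling in CityHash is not an option, the same idea can be sketched with any 64-bit string hash. The helper below is hypothetical (the name StringKey is made up here and is not part of cuckoofilter); it uses FNV-1a as a stand-in for CityHash64. The filter calls in the trailing comment assume the Add/Contain interface and the cuckoofilter::Ok status from the library's own example, so treat them as an illustration rather than tested code.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper, not part of cuckoofilter: maps a string to a 64-bit key
// using FNV-1a. It stands in for CityHash64 from the comment above.
inline uint64_t StringKey(const std::string& s) {
  uint64_t h = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
  for (unsigned char c : s) {
    h ^= c;                  // mix in one byte
    h *= 1099511628211ULL;   // FNV-1a 64-bit prime
  }
  return h;
}

// Assumed usage with the filter from this thread (needs cuckoofilter.h):
//   cuckoofilter::CuckooFilter<size_t, 12> filter(1000);
//   filter.Add(StringKey("Hello World"));
//   bool maybe_present =
//       filter.Contain(StringKey("Hello World")) == cuckoofilter::Ok;
```

The filter then only ever sees integers, which is what the default hasher expects. Lookups have to go through the same helper as inserts.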

Tom-CaoZH commented 1 year ago

I wonder whether adding a hash function will affect the results (the false positive rate, for example). @siara-cc

siara-cc commented 1 year ago

The hash function is the basis of filters like this, so a better hash function gives a better false positive rate. A poor hash function has more "collisions", meaning it produces the same result for different input values, and that leads to more false positives. @Tom-CaoZH
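
To make the collision point concrete, here is a small standalone sketch that does not use cuckoofilter at all. It counts how many sample strings end up with a 64-bit key that an earlier string already produced. WeakHash, Fnv1a64, CountCollisions and MakeKeys are all hypothetical names invented for this illustration, and the "key0", "key1", ... inputs are made-up sample data. Two strings with the same key are indistinguishable to the filter, so every such collision is a guaranteed false positive on top of the filter's own rate.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Deliberately poor hash for illustration: the sum of the byte values.
// Anagrams and many other inputs share the same result.
inline uint64_t WeakHash(const std::string& s) {
  uint64_t h = 0;
  for (unsigned char c : s) h += c;
  return h;
}

// 64-bit FNV-1a, used here as an example of a reasonable string hash.
inline uint64_t Fnv1a64(const std::string& s) {
  uint64_t h = 14695981039346656037ULL;  // offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL;  // FNV prime
  }
  return h;
}

// Counts keys whose hash value was already produced by an earlier key.
template <typename Hash>
std::size_t CountCollisions(const std::vector<std::string>& keys, Hash hash) {
  std::unordered_set<uint64_t> seen;
  std::size_t collisions = 0;
  for (const auto& k : keys) {
    if (!seen.insert(hash(k)).second) ++collisions;
  }
  return collisions;
}

// Builds made-up sample keys "key0", "key1", ..., "key<n-1>".
inline std::vector<std::string> MakeKeys(std::size_t n) {
  std::vector<std::string> keys;
  keys.reserve(n);
  for (std::size_t i = 0; i < n; ++i) keys.push_back("key" + std::to_string(i));
  return keys;
}
```

Calling CountCollisions(MakeKeys(1000), WeakHash) and CountCollisions(MakeKeys(1000), Fnv1a64) side by side shows the gap: the byte-sum hash can produce at most a few dozen distinct values for these inputs, so almost every key collides with an earlier one.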