seiflotfy / cuckoofilter

Cuckoo Filter: Practically Better Than Bloom
MIT License
1.13k stars 109 forks source link

Algorithm improvements #15

Closed chessman closed 6 years ago

chessman commented 6 years ago

I have a test that checks a space utilization:

    cf := cuckoofilter.NewCuckooFilter(cap)
    for i := uint(0); i < cap; i++ {
        ok := cf.Insert(randomBytes())
        if !ok {
            fmt.Printf("Error on %v/%v\n", i, cap)
        }
    }

With unmodified version of the cuckoofilter I have an error on 64073/1048576. It means that reinsert failed after 6% entries occupied.

With this PR I have no errors up to 96% space utilized. I believe that the problem is the similar hashing algorithms of fingerprint and i1. Your version is trying to reinsert fp to the same bucket and after maxCuckooCount it fails.

Also I removed the unnecessary lock in getFingerprint because it affected the use of multiple instances of the filter and didn't make the filter thread-safe. I think that lock should be in a filter object and lock/unlock should protect each public method.

seiflotfy commented 6 years ago

thanks so so much