bdupras / guava

Google Core Libraries for Java 6+
Apache License 2.0

Calculations for optimalEntriesPerBucket and optimalLoadFactor #1

Open bdupras opened 8 years ago

bdupras commented 8 years ago

The current methods for determining bucket size (number of entries per bucket) and load factor are based on the prose in Section 5.1 of "Cuckoo Filter: Practically Better Than Bloom".

It would be better to have true calculations instead of the current conditional logic with hard-coded values.

Heya @apc999 - do you have any recommendations here?

// cc @beala

bdupras commented 8 years ago

Oh - and by the way, this is work in progress. This version of the CuckooFilter doesn't yet function. When I get it functional (soon), I'll probably break it out from the guava fork into its own repo.

apc999 commented 8 years ago

@bdupras, in my past experience, 4 cells per bucket ensures a pretty good load factor: 95%+ with 500 retries on insert. Using 8 cells per bucket gets an even higher load factor (e.g., close to 99%) with even fewer retries, but at the cost of doubling the false positive rate. I would say 4 or 8 is a good value to use in practice.
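To make the trade-off above concrete, here is a minimal Java sketch, not the code in this fork. It assumes the usual approximation from the cuckoo filter paper: a lookup compares against up to 2b fingerprints of f bits each, so the false positive probability is bounded by 1 - (1 - 2^-f)^(2b), roughly 2b / 2^f. The class and method names (`CuckooSizingSketch`, `falsePositiveBound`, `minFingerprintBits`, `bucketsNeeded`) are hypothetical, and the load factors 0.95 and 0.99 are simply the empirical figures quoted in the comment above, not derived values.

```java
// Hypothetical sizing helpers; names and structure are illustrative only.
final class CuckooSizingSketch {
    private CuckooSizingSketch() {}

    // Bound on the false positive probability for b entries per bucket and
    // f-bit fingerprints: a lookup checks 2 buckets, so up to 2b fingerprints.
    static double falsePositiveBound(int entriesPerBucket, int fingerprintBits) {
        double noMatchPerEntry = 1.0 - Math.pow(2.0, -fingerprintBits);
        return 1.0 - Math.pow(noMatchPerEntry, 2.0 * entriesPerBucket);
    }

    // Smallest f such that 2b / 2^f <= targetFpp, i.e. ceil(log2(2b / targetFpp)).
    static int minFingerprintBits(int entriesPerBucket, double targetFpp) {
        double bits = Math.log(2.0 * entriesPerBucket / targetFpp) / Math.log(2.0);
        return (int) Math.ceil(bits);
    }

    // Buckets needed to hold `capacity` items at an assumed achievable load factor.
    static long bucketsNeeded(long capacity, int entriesPerBucket, double loadFactor) {
        return (long) Math.ceil(capacity / (loadFactor * entriesPerBucket));
    }

    public static void main(String[] args) {
        // Load factors below are the empirical values from the thread (assumed, not computed).
        int[] bucketSizes = {4, 8};
        double[] loadFactors = {0.95, 0.99};
        for (int i = 0; i < bucketSizes.length; i++) {
            int b = bucketSizes[i];
            System.out.printf("b=%d: fpp bound at f=12 is %.6f, min f for 1%% fpp is %d, buckets for 1M items: %d%n",
                    b,
                    falsePositiveBound(b, 12),
                    minFingerprintBits(b, 0.01),
                    bucketsNeeded(1_000_000L, b, loadFactors[i]));
        }
    }
}
```

With the fingerprint size held fixed, going from 4 to 8 entries per bucket roughly doubles the bound, which matches the cost described above. Holding the target false positive rate fixed instead costs one extra fingerprint bit. The achievable load factor is the part that still has no closed form here, so it stays an input.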