Open Glench opened 12 years ago
Great idea! I originally didn't include n and p as I was inspired by https://github.com/willf/bloom, which includes an empirical false positive estimator rather than relying on theory. I'm not entirely sure why he does this, but it's probably because FNK isn't as good as a theoretically perfect hash function (although it is very fast and hence "good enough" for bloom filters).
I plan on making the hash function configurable anyway, so I'll add support for this at the same time.
The other idea I had was the ability to set a fixed false positive rate, and have the bit array automatically grow (perhaps by doubling each time) to keep the rate below a fixed value.
Ah, I see. Excellent!
Apparently this guy implemented the latter idea you mentioned. I haven't looked at his implementation, but maybe you can take inspiration/code from there :)
I implemented such an approach on a fork of mine : https://github.com/Phyks/bloomfilter.js
This uses capacity
and error_rate
instead of m
and k
as in the current implementation.
@jasondavies It might not be well fitted for a direct PR, and I'm not sure about my code, but it does the job AFAIK, and passes my tests successfully. Thanks for writing the original code !
I was just wondering why you don't have an option to use the optimal m and k parameters (see here) based on n and p.
There are a couple ways I imagine this going:
true
would interpret the first and second values of the constructor as n and p instead of m and k.I could implement any of these, but I wanted to ask you first if there was any reason why you didn't do this.