Open ScottShao opened 8 years ago
It would be helpful if you provided links to specific lines in a specific commit of the master branch.
https://github.com/jaybaird/python-bloomfilter/blob/master/pybloom/pybloom.py
In line 365 and line 372, when you add a new filter to the scalable filter, it seems you are using different error rates: the first filter gets error_rate * (1 - ratio), while the rest of the filters get error_rate * ratio.
Thanks for the additional information. When the master branch changes, the above link will point to different lines. The reference can be made persistent with:
https://github.com/jaybaird/python-bloomfilter/blob/70e25c653ab87fbc2273328e89544d4124f52065/pybloom/pybloom.py#L365
and:
https://github.com/jaybaird/python-bloomfilter/blob/70e25c653ab87fbc2273328e89544d4124f52065/pybloom/pybloom.py#L372
that can also be written as line 365 and line 372.
Please note that I am not the author of pybloom.
Thank you for the tips.
@ScottShao I think this version https://pypi.python.org/pypi/pybloom_live/2.1.0 addresses your concerns.
@joseph-fox Thank you for sharing the information. It seems that the only place ratio is used is when a new filter is created, where ratio * error_rate becomes the error rate of the new filter. If this is the case, I don't think we need the ratio parameter any more.
Hi, I'm wondering why you are using ratio in ScalableBloomFilter. It seems that the first filter has a different error rate from the rest of the filters: in the code, the first filter has an error rate of error_rate * (1 - ratio), and the rest of the filters have an error rate of error_rate * ratio.
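To make the arithmetic in this question concrete, here is a minimal sketch that computes the per-filter error rates exactly as they are described above. It is not copied from pybloom: the helper names filter_error_rates and compound_error_rate are hypothetical, and the compound rate assumes the filters produce false positives independently.

```python
def filter_error_rates(error_rate, ratio, n_filters):
    """Per-filter error rates as described in this issue (an assumption,
    not a copy of the pybloom source)."""
    rates = []
    for i in range(n_filters):
        if i == 0:
            # First filter: error_rate * (1 - ratio)
            rates.append(error_rate * (1.0 - ratio))
        else:
            # Every later filter: error_rate * ratio
            rates.append(error_rate * ratio)
    return rates


def compound_error_rate(rates):
    """Chance that at least one filter reports a false positive,
    assuming the filters are independent."""
    p_no_false_positive = 1.0
    for r in rates:
        p_no_false_positive *= 1.0 - r
    return 1.0 - p_no_false_positive


# Example values chosen only for illustration.
rates = filter_error_rates(error_rate=0.001, ratio=0.9, n_filters=4)
total = compound_error_rate(rates)
```

Under this reading, every filter after the first contributes the same error_rate * ratio, so the compound rate keeps growing as filters are added, which is why the asymmetry between the first filter and the rest looks suspicious.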