barrust / pyprobables

Probabilistic data structures in python http://pyprobables.readthedocs.io/en/latest/index.html
MIT License

Add a function to insert raw `bytes` into a filter. #74

Closed KOLANICH closed 2 years ago

barrust commented 2 years ago

I believe this is possible just by supplying a different hashing function, but to verify, could you provide a simple example of what you would like to do? Thanks!
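For illustration, the "different hashing function" idea could look roughly like the sketch below. It is only a sketch under assumptions: the names `to_bytes` and `bytes_hash` are hypothetical and not part of pyprobables, the use of SHA-256 is an arbitrary choice for the example, and the `(key, depth)` signature returning a list of integers is assumed to be what the filter expects from a custom hash function.

```python
import hashlib
from typing import List, Union

Key = Union[str, bytes, bytearray]


def to_bytes(key: Key) -> bytes:
    """Normalize a key to bytes (hypothetical helper); str keys are UTF-8 encoded."""
    if isinstance(key, str):
        return key.encode("utf-8")
    if isinstance(key, (bytes, bytearray)):
        return bytes(key)
    raise TypeError(f"unsupported key type: {type(key).__name__}")


def bytes_hash(key: Key, depth: int = 1) -> List[int]:
    """Return `depth` 64-bit integers for a str or bytes key.

    Assumption: the filter calls its hash function as f(key, depth)
    and expects a list of `depth` integers back.
    """
    data = to_bytes(key)
    results = []
    for i in range(depth):
        # Prefix each round with its index so the rounds give different values.
        digest = hashlib.sha256(i.to_bytes(4, "big") + data).digest()
        # Keep the first 8 bytes of the digest as an unsigned 64-bit integer.
        results.append(int.from_bytes(digest[:8], "big"))
    return results
```

With this normalization, `bytes_hash("this is a test", 3)` and `bytes_hash(b"this is a test", 3)` return the same list, so a str key and its UTF-8 encoding would map to the same filter positions. If the installed version of the library accepts a custom `hash_function` argument, passing `bytes_hash` there might be enough to let `add` and `check` take raw bytes, though that is untested here.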

KOLANICH commented 2 years ago

I mean that keys are currently strings, but I need byte strings. Bytes are far more versatile, since strings can always be converted to bytes with low overhead.

I have created a tool, https://github.com/KOLANICH-tools/DilatedHash.py, that uses Bloom/Cuckoo filters on raw bytes. It is not yet finished and should not be used; large changes will likely happen.

barrust commented 2 years ago

I will have to do more testing on whether a str object is required. I don't see a place in the code where it is enforced, but it has been a while since I added a lot of the functionality, so I could be wrong.

barrust commented 2 years ago

Just a quick test that resulted in a type error:

from probables import BloomFilter
b = BloomFilter(est_elements=1000, false_positive_rate=0.1)
b.add(b'this is a test')  # passing bytes instead of str is what raised the type error

KOLANICH commented 2 years ago

Thank you.