prashnts / pybloomfiltermmap3

Fast Python Bloom Filter using Mmap
https://github.com/prashnts/pybloomfiltermmap3
MIT License

Add bit_array property to inspect Bloom filter bit data #21

Closed mizvyt closed 4 years ago

mizvyt commented 4 years ago

The bit_array property returns an integer that represents the Bloom filter's raw bit data. The added unit tests also illustrate some possible uses.

Needless to say, you can't make assumptions about how things are hashed just from the bit array. If you create two Bloom filters from the same list of hash seeds, you can check that they have the same bit data. But if you want to verify that an item was hashed correctly (a possible test use case?), you'd need to perform the hashing manually outside the library, set the corresponding bits in a bit array of your own, and then compare it to the outcome of bf.add(something). All in all, this feature has limited uses, but I believe it may be useful, especially, as @AaronRanAn mentioned, for educational purposes and debugging.
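To make the two checks above concrete, here is a rough sketch using a toy, pure-Python Bloom filter. Everything in it is an assumption for illustration: `ToyBloomFilter`, `bit_position`, and the SHA-256 based hashing are hypothetical stand-ins and do not reflect how pybloomfiltermmap3 hashes items or lays out its bits. Only the idea of exposing the bit data as an integer named `bit_array` is borrowed from this PR.

```python
import hashlib


def bit_position(item, seed, num_bits):
    """Hypothetical hash: map (seed, item) to a bit index in [0, num_bits)."""
    digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
    return int.from_bytes(digest, "big") % num_bits


class ToyBloomFilter:
    """Illustrative stand-in only; NOT the library's implementation."""

    def __init__(self, num_bits, hash_seeds):
        self.num_bits = num_bits
        self.hash_seeds = list(hash_seeds)
        self.bit_array = 0  # the whole filter stored as one Python integer

    def add(self, item):
        # Set one bit per hash seed.
        for seed in self.hash_seeds:
            self.bit_array |= 1 << bit_position(item, seed, self.num_bits)

    def __contains__(self, item):
        # The item may be present only if every one of its bits is set.
        return all(
            (self.bit_array >> bit_position(item, seed, self.num_bits)) & 1
            for seed in self.hash_seeds
        )


seeds = [11, 22, 33]  # arbitrary example seeds

# Check 1: same seeds and same items give identical bit data.
a = ToyBloomFilter(64, seeds)
b = ToyBloomFilter(64, seeds)
for word in ["apple", "banana"]:
    a.add(word)
    b.add(word)
same_bits = a.bit_array == b.bit_array

# Check 2: hash manually outside the filter, set the bits ourselves,
# then compare against the outcome of add().
expected = 0
for seed in seeds:
    expected |= 1 << bit_position("cherry", seed, 64)

c = ToyBloomFilter(64, seeds)
c.add("cherry")
matches_manual = c.bit_array == expected
```

With the real library, the second check would additionally require reproducing its hash functions exactly, which is why the bit array alone tells you little about how items were hashed.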

Ref. https://github.com/prashnts/pybloomfiltermmap3/issues/19