joseph-fox / python-bloomfilter

Scalable Bloom Filter implemented in Python
MIT License

Reloading from a File does not appear to work in python 3 #7

Open bscain opened 7 years ago

bscain commented 7 years ago

The following sample code was run on both Python 2 and Python 3, with the same result:

from pybloom_live import BloomFilter

if __name__ == '__main__':

    bf1 = BloomFilter(capacity=1000, error_rate=0.001)
    bf2 = BloomFilter(capacity=1000, error_rate=0.001)

    bf1.add(1)
    bf1.add(2)

    print("Test 1 in bf1: " + str(1 in bf1))
    print("Test 2 in bf1: " + str(2 in bf1))

    with open("/tmp/bloomTest.txt", "wb") as fp:
        bf1.tofile(fp)

    with open("/tmp/bloomTest.txt", "rb") as fp2:
        bf2.fromfile(fp2)

    print("Test 1 in bf2: " + str(1 in bf2))
    print("Test 2 in bf2: " + str(2 in bf2))

Output:

Test 1 in bf1: True
Test 2 in bf1: True
Test 1 in bf2: False
Test 2 in bf2: False

bscain commented 7 years ago

Turns out I was using it incorrectly. I recommend adding samples showing how to use tofile / fromfile.

ghost commented 7 years ago

I will take a look at it as soon as I can. Thanks @bscain

evanfoster commented 6 years ago

@bscain would you be able to comment with a very short example of how to use this feature? I'd just read the source but I'm away from my computer. No worries if you aren't able to, I'll do it once I can.

paularmand commented 6 years ago

This should work:

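from io import BytesIO

import pybloom_live

# sbf is your scalable bloom filter (a default one is created here as a placeholder)
sbf = pybloom_live.ScalableBloomFilter()
sbf.add("example")

bytesio = BytesIO()
sbf.tofile(bytesio)

# reset the stream handle to the start
bytesio.seek(0)
# you can check it with: print(bytesio.getvalue().hex())

# fromfile is called on the class and returns a new filter
sbf_read = pybloom_live.ScalableBloomFilter.fromfile(bytesio)
bytesio.close()

Huangvivi commented 3 years ago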

Hey, I ran into this problem as well, then found that fromfile is a classmethod: it returns a new filter rather than loading into the instance it is called on. That means you don't need to initialize bf2 with arguments; just do:

with open("/tmp/bloomTest.txt", "rb") as fp2:
    bf2 = BloomFilter.fromfile(fp2)

It works for me.
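
Why the original bf2.fromfile(fp2) call appeared to do nothing can be seen with a small stand-in class. This is only a sketch: TinyFilter and everything inside it are hypothetical names, not pybloom_live's implementation. It mimics just the shape discussed in this thread, an instance method tofile and a classmethod fromfile, using only the standard library.

```python
import io
import pickle


class TinyFilter:
    """Hypothetical stand-in, NOT pybloom_live: a plain set with the same
    tofile / classmethod-fromfile shape discussed in this thread."""

    def __init__(self):
        self.items = set()

    def add(self, key):
        self.items.add(key)

    def __contains__(self, key):
        return key in self.items

    def tofile(self, f):
        # Serialize this instance's state to a binary file object.
        pickle.dump(self.items, f)

    @classmethod
    def fromfile(cls, f):
        # Build and RETURN a new instance; no existing instance is modified.
        new = cls()
        new.items = pickle.load(f)
        return new


src = TinyFilter()
src.add(1)

buf = io.BytesIO()
src.tofile(buf)

# Pitfall: called on an instance, the classmethod still returns a new object.
# The return value is discarded here, so `stale` stays empty.
buf.seek(0)
stale = TinyFilter()
stale.fromfile(buf)
pitfall_result = 1 in stale

# Fix: call it on the class and keep the returned object.
buf.seek(0)
loaded = TinyFilter.fromfile(buf)
fixed_result = 1 in loaded

print(pitfall_result, fixed_result)  # prints: False True
```

The same pattern explains the False results in the original report: the loaded filter was returned and thrown away, while bf2 remained the empty filter it was constructed as.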