remram44 / python-bloom-filter

Bloom filter for Python
https://pypi.org/project/bloom-filter2/

More documentation for Mmap backend? #6

Closed AADeLucia closed 2 years ago

AADeLucia commented 2 years ago

I am unfamiliar with mmap, but I would love to be able to save/load the Bloom filters. I am unclear on how to load a previously created Bloom filter. Do I have to load it back with mmap or just the BloomFilter with start_fresh=False?

Thanks.

AADeLucia commented 2 years ago

After a quick test I was able to load it back successfully, never mind! But more detailed documentation about the different backends would be helpful.

remram44 commented 2 years ago

There is no saving or loading with mmap

RoboDonut commented 2 years ago

@AADeLucia what did you do to load it back? start_fresh=False is still creating new files :(

AADeLucia commented 2 years ago

This test worked for me:

from bloom_filter2 import BloomFilter

filename = "bloom_test.bin"

bloom = BloomFilter(
    max_elements=1000,
    error_rate=0.01,
    filename=(filename, -1),
    start_fresh=True
)
print("Created bloom filter ", filename)
# Add elements
bloom.add("does")
print("Added 'does'")
bloom.add("this")
print("Added 'this'")
bloom.add("work")
print("Added 'work'")

# Delete
del bloom
print("Delete filter from memory")

# Reload
bloom2 = BloomFilter(
    max_elements=1000,
    error_rate=0.01,
    filename=(filename, -1),
    start_fresh=False
)
print("Reload filter from ", filename)

# Test
print("'does' should be in bloom: ", "does" in bloom2)
print("'work' should be in bloom: ", "work" in bloom2)
print("'nope' should not be in bloom: ", "nope" in bloom2)

AADeLucia commented 2 years ago

Still not sure what "There is no saving or loading with mmap" is about... I suggested better documentation since this is in the README

It can still be pretty useful to save/load to files with the mmap implementation, for example to avoid rebuilding the Bloom filter. The mmap functionality also saves some memory, depending on system settings.
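
One way to see why reopening the file works: with a memory-mapped backend the file on disk is the bit array itself, so there is no separate save step. Here is a minimal sketch using only Python's standard mmap module. It is not bloom_filter2's actual implementation, and the helper names set_bit and get_bit are made up for illustration.

```python
import mmap
import os
import tempfile


def set_bit(path, num_bits, bit_index):
    """Set one bit in a file-backed bit array (hypothetical helper)."""
    num_bytes = (num_bits + 7) // 8
    # Create a zero-filled file on first use; mmap cannot map an empty file.
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(b"\x00" * num_bytes)
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), num_bytes) as mm:
            mm[bit_index // 8] |= 1 << (bit_index % 8)
            mm.flush()  # push the change to the underlying file


def get_bit(path, num_bits, bit_index):
    """Read one bit back through a fresh read-only mapping (hypothetical helper)."""
    num_bytes = (num_bits + 7) // 8
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), num_bytes, access=mmap.ACCESS_READ) as mm:
            return bool(mm[bit_index // 8] & (1 << (bit_index % 8)))


path = os.path.join(tempfile.mkdtemp(), "bits.bin")
set_bit(path, 64, 10)          # first "session": map, write, unmap
print(get_bit(path, 64, 10))   # new mapping of the same file -> True
print(get_bit(path, 64, 11))   # untouched bit -> False
```

Both calls have to agree on num_bits, since that decides how many bytes get mapped. That is the same reason the reload in the test above passes the same max_elements and error_rate as the original call.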

RoboDonut commented 2 years ago

Thanks, @AADeLucia!! I guess I just needed to keep the same max_elements and error_rate params as when it was created. :shrug: :smile:
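
If keeping the parameters in sync is the issue, one option is to store them next to the filter file and read them back before reopening. This is only a sketch under that assumption: the ".params.json" sidecar file and the helpers save_params, load_params and reopen_filter are hypothetical, not part of bloom_filter2. The BloomFilter call uses only the arguments shown in the test above.

```python
import json
import os
import tempfile


def save_params(filename, max_elements, error_rate):
    """Write the sizing parameters to a sidecar JSON file (hypothetical convention)."""
    with open(filename + ".params.json", "w") as f:
        json.dump({"max_elements": max_elements, "error_rate": error_rate}, f)


def load_params(filename):
    """Read the sizing parameters back from the sidecar file."""
    with open(filename + ".params.json") as f:
        return json.load(f)


def reopen_filter(filename):
    """Reopen an existing filter file with the parameters it was created with."""
    # Imported here so the helpers above work without the third-party package.
    from bloom_filter2 import BloomFilter

    params = load_params(filename)
    return BloomFilter(
        max_elements=params["max_elements"],
        error_rate=params["error_rate"],
        filename=(filename, -1),
        start_fresh=False,
    )


# Round trip of the parameters only; no filter file is created here.
demo_file = os.path.join(tempfile.mkdtemp(), "bloom_test.bin")
save_params(demo_file, max_elements=1000, error_rate=0.01)
print(load_params(demo_file))
```

Calling save_params right after creating the filter with start_fresh=True, then reopen_filter(filename) later, avoids having to remember the original values by hand.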