axiak / pybloomfiltermmap

Fast Python Bloom Filter using Mmap
http://axiak.github.com/pybloomfiltermmap/
MIT License

error with large dataset #41

Closed joe42 closed 10 years ago

joe42 commented 10 years ago

Hi,

I would like to use the Bloom filter with very large data sets. I will have 1000000000000000000000000000000 entries. The entries themselves are numbers up to the number of entries. With pybloomfiltermmap I get an OverflowError, or a MemoryError if I use fewer entries. Could you suggest a solution?

Thanks, joe42

axiak commented 10 years ago

Hi Joe,

I would rethink your strategy. 10^31 is an incredibly large number of entries, and not one that a data structure like a Bloom filter can handle.
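
For a rough sense of scale, here is a minimal sketch using the textbook sizing formula for a standard Bloom filter, m = -n ln(p) / (ln 2)^2 bits for n items at false-positive rate p. This is not necessarily how pybloomfiltermmap sizes its filter internally. The helper names and the 1% error rate are assumptions made for illustration, and n is taken as the 10^31 figure mentioned above.

```python
import math


def bloom_filter_bits(n_items, error_rate):
    """Bits needed by a standard Bloom filter: m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-n_items * math.log(error_rate) / (math.log(2) ** 2))


def optimal_hash_count(n_items, n_bits):
    """Number of hash functions that minimizes the error rate: k = (m / n) * ln 2."""
    return max(1, round(n_bits / n_items * math.log(2)))


n = 10**31   # entry count, using the 10^31 figure from this thread
p = 0.01     # assumed 1% false-positive rate (illustrative only)

bits = bloom_filter_bits(n, p)
hashes = optimal_hash_count(n, bits)

print(f"bits per entry : {bits / n:.2f}")
print(f"total bytes    : {bits / 8:.3e}")
print(f"hash functions : {hashes}")
```

At roughly ten bits per entry, the filter alone would need on the order of 10^31 bytes, which is far more than any machine can allocate or mmap. That is consistent with the OverflowError and MemoryError reported above.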

andresriancho commented 10 years ago

Maybe this one should be closed.