Callidon / bloom-filters

JS implementation of probabilistic data structures: Bloom Filter (and its derived), HyperLogLog, Count-Min Sketch, Top-K and MinHash
https://callidon.github.io/bloom-filters/
MIT License
360 stars 41 forks source link

import / export BloomFilter not working as expected #11

Closed imcotton closed 4 years ago

imcotton commented 4 years ago

Hi, these functionalities seemed have issue:

https://github.com/Callidon/bloom-filters/blob/9dec78ea58f4651bfd5e05aa6b83df8ecad18a5a/src/bloom-filter.js#L79-L85

https://github.com/Callidon/bloom-filters/blob/9dec78ea58f4651bfd5e05aa6b83df8ecad18a5a/src/export-import-specs.js#L60-L74

https://github.com/Callidon/bloom-filters/blob/9dec78ea58f4651bfd5e05aa6b83df8ecad18a5a/src/bloom-filter.js#L56-L62

folkvir commented 4 years ago

Thank you for sharing this. I will fix this tomorrow.

folkvir commented 4 years ago

@imcotton is it the last available version btw? or an older one?

imcotton commented 4 years ago

The latest version from the npm (0.8.0).

folkvir commented 4 years ago

The proper way to create a bloom filter is using the current constructor if you want to customize the number of hash functions used or the size of the filter. But if you dont, use the static .create() function. So I deleted the _errorRate property. The rate is computed/choosed when you construct the filter. Even if you export and import the filter you should have the same error rate between those 2 instances. If you want to add this property to your export json value call the .rate() method then add it to your object.

I am able to reproduce the issue only by using a serialization step before importing the structure. Otherwise it works correctly.

let exported = filter.saveAsJSON()
// simulate serialization
exported = JSON.stringify(exported)
// simulate deserialization
exported = JSON.parse(exported)
const newFilter = BloomFilter.fromJSON(exported)

Tempory Workaround: Just add _errorRate before importing the filter if you dont want to bump to the new version.

Fix: I removed the _errorRate from the export/import spec. It should work correctly now.

folkvir commented 4 years ago

@Callidon Can you bump the version? Fix is pushed in the lasts commit. I bump the version to the 0.8.1.

Callidon commented 4 years ago

The new version 0.8.1 is tagged and available on npm. Thanks @folkvir for the fix!