Closed lemire closed 4 years ago
The Bloom filter implementation is buggy. For large inputs, there are false positives.
#include <sstream> #include "bloom.h" int main(int argc, char **argv) { size_t array_size = 500 * 1000 * 1000; uint64_t *array = new uint64_t[array_size]; for (size_t i = 0; i < array_size; i++) array[i] = i; using Table = bloomfilter::BloomFilter<uint64_t, 12, false, SimpleMixSplit>; Table table(array_size); table.AddAll(array, 0, array_size); printf("Checking for false negatives\n"); for (size_t i = 0; i < array_size; i++) { if (table.Contain(array[i]) != bloomfilter::Ok) { printf("Detected a false negative. You probably have a bug. Aborting.\n"); return EXIT_FAILURE; } } printf("Verified with success: no false negatives\n"); }
The issue was an overflow in AddAll.
I am not adding a check because checking for such an overflow is quite expensive.
The Bloom filter implementation is buggy. For large inputs, there are false positives.