deceze / BloomFilter

A binary based, very space efficient Bloom filter implementation for PHP.
28 stars 7 forks source link

Possible to change false positive rate? #2

Open LeMoussel opened 8 years ago

LeMoussel commented 8 years ago

With this Test, I got 115 false positive (t=95115, => false positive rate = 0.115 %) Is it possible to change the false positive rate, eg set to 0.05%?

$sample = 100000;
$offset = 5000;

require_once 'BloomFilter/BloomFilter.php';

// 1Mo ram = 800000 for 100000 values
$s1 = microtime(true);
$decezeBloomFilter = BloomFilter::constructForTypicalSize($sample*8, $sample);
$e1 = microtime(true);

$s2 = microtime(true);
for ($i = 0; $i < $sample; $i++) $decezeBloomFilter->add($i);
$e2 = microtime(true);

$t = 0;
$s3 = microtime(true);
for ($i = $offset; $i < $sample + $offset; $i++) $t += $decezeBloomFilter->maybeInSet($i);
$e3 = microtime(true);

echo 'deceze/BloomFilter Create Time: '.($e1 - $s1).PHP_EOL;
echo 'deceze/BloomFilter Add Time: '.($e2 - $s2).' ('.floor($sample/($e2-$s2)).' i/s)'.PHP_EOL;
echo 'deceze/BloomFilter Check Time: '.($e3 - $s3).' ('.floor($sample/($e3-$s3)).' i/s)'.PHP_EOL;
echo $t;
deceze commented 8 years ago

Quite honestly I have no answer to this; it's been several years since I've looked into Bloom filters at all, and I never got that invested in them in the first place. You should be able to affect the false positive rate by creating a bigger initial array. The question is whether this implementation is somehow suboptimal and exhibits a higher than expected false positive rate. And this I have no idea about.