Baqend / Orestes-Bloomfilter

Library of different Bloom filters in Java with optional Redis-backing, counting and many hashing options.
Other
843 stars 245 forks source link

In "BloomFilter" interface, "union" and "intersect" methods returns "false" #25

Closed rogerdielrton closed 8 years ago

rogerdielrton commented 8 years ago

Hi. Congratulations for your Bloom Filter library: is very useful and complete. Come on with my problem: The execution of my test code (see below "MyTest" Java class) fails because "union" and "intersection" methods returns "false": Interface orestes.bloomfilter.BloomFilter:

Why both returns "false"? What's the meaning of "bloom filter could SUCCESSFULLY be updated"? How to avoid tha "union" and "intersection" works well without returning "false"? Thanks in advance.

public class MyTest {

    public static void main(String[] args) {
        BloomFilter<Long> alpha = new FilterBuilder()
                .expectedElements(50)
                .falsePositiveProbability(0.05)
                .hashFunction(HashMethod.Murmur3)
                .buildBloomFilter();
        fill(1, 50, alpha);     
        BloomFilter<Long> omega = new FilterBuilder()
                .expectedElements(62500)
                .falsePositiveProbability(0.05)
                .hashFunction(HashMethod.Murmur3)
                .buildBloomFilter();
        fill(2, 62500, omega);      
        try {
            System.out.println("Intersection estimated population: " + intersection(alpha, omega).getEstimatedPopulation());
        }
        catch (UnsupportedOperationException uoe) {
            uoe.printStackTrace();
        }
        try {           
            System.out.println("Union estimated population: " + union(alpha, omega).getEstimatedPopulation());
        }
        catch (UnsupportedOperationException uoe) {
            uoe.printStackTrace();
        }
    }

    static BloomFilter<Long> union(BloomFilter<Long> a, BloomFilter<Long> b) {
        BloomFilter<Long> c = a.clone();
        boolean success = c.union(b);
        if (!success) throw new UnsupportedOperationException();
        return c;
    }

    static BloomFilter<Long> intersection(BloomFilter<Long> a, BloomFilter<Long> b) {
        BloomFilter<Long> c = a.clone();
        boolean success = c.intersect(b);
        if (!success) throw new UnsupportedOperationException();
        return c;
    }

    static void fill(int seed, int max, BloomFilter<Long> bf) {
        Random r = new Random(seed); 
        LongStream.rangeClosed(1, max).map(i -> r.nextLong()).filter(i -> i > 0).forEach(i -> bf.add(i));
    }

}
rogerdielrton commented 8 years ago

I found the cause:

FilterBuilder.isCompatibleTo:
    return this.size() == other.size() && this.hashes() == other.hashes() && this.hashMethod() == other.hashMethod();

According to that, because "alpha" and "omega" match in "hashes" and "hash method", but differ on "size", then "intersect" and "union" returns "false".