Baqend / Orestes-Bloomfilter

Library of different Bloom filters in Java with optional Redis-backing, counting and many hashing options.

ArrayOutOfBounds exception on load testing #12

Closed by turf00 10 years ago

turf00 commented 10 years ago

During load testing of the Counting Bloom Filter inside a Spring MVC web application running in Tomcat, we see an ArrayIndexOutOfBoundsException as follows:

java.lang.ArrayIndexOutOfBoundsException
    org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:948)
    org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:827)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
    org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:812)
    javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
root cause: java.lang.ArrayIndexOutOfBoundsException
    sun.security.provider.DigestBase.engineUpdate(DigestBase.java:127)
    sun.security.provider.MD5.implDigest(MD5.java:105)
    sun.security.provider.DigestBase.engineDigest(DigestBase.java:186)
    sun.security.provider.DigestBase.engineDigest(DigestBase.java:165)
    java.security.MessageDigest$Delegate.engineDigest(MessageDigest.java:576)
    java.security.MessageDigest.digest(MessageDigest.java:353)
    java.security.MessageDigest.digest(MessageDigest.java:399)
    orestes.bloomfilter.BloomFilter.hashCrypt(BloomFilter.java:619)
    orestes.bloomfilter.BloomFilter.hash(BloomFilter.java:679)
    orestes.bloomfilter.BloomFilter.contains(BloomFilter.java:235)
    orestes.bloomfilter.BloomFilter.contains(BloomFilter.java:242)

Our counting Bloom filter is accessed concurrently for membership tests and is created as follows:

bloomFilter = new CBloomFilter<String>(40_000_000, 0.01, 4);

Prior to starting the test we have preloaded 17 million elements.

This is with Java 7 release 40.

DivineTraube commented 10 years ago

Hi, thanks for sharing this. This is almost certainly caused by concurrent access to the Bloom filter, which is not thread-safe. Internally it uses a Java MessageDigest object, which is itself not thread-safe, and sharing it across threads produces the error you encountered. For now, you can use the Bloomfilters class to wrap the Bloom filter in a thread-safe decorator (similar to Collections.synchronizedXYZ). We are currently doing a major rewrite in which all Bloom filters will be inherently thread-safe, so this issue will no longer occur. We plan to release the new version of the Bloom filters next week.
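
For readers who want to see the decorator idea in isolation, here is a minimal sketch. It does not use the library's API: the MembershipFilter interface, the Md5Filter class, and the SynchronizedFilter wrapper are hypothetical names made up for illustration. Md5Filter shares one MessageDigest across all calls, as the comment above describes, and the wrapper serializes access with a single lock, in the spirit of Collections.synchronizedXYZ.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.BitSet;
import java.util.concurrent.atomic.AtomicInteger;

class SynchronizedFilterDemo {

    /** Hypothetical minimal interface for this sketch; not the library's API. */
    interface MembershipFilter {
        void add(String value);
        boolean contains(String value);
    }

    /** Deliberately NOT thread-safe: every call shares one MessageDigest and one BitSet. */
    static final class Md5Filter implements MembershipFilter {
        private final MessageDigest md5;
        private final BitSet bits;
        private final int size;

        Md5Filter(int size) {
            try {
                this.md5 = MessageDigest.getInstance("MD5");
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException(e);
            }
            this.size = size;
            this.bits = new BitSet(size);
        }

        /** Splits the 16-byte MD5 digest into four 32-bit words, one bit position each. */
        private int[] positions(String value) {
            byte[] d = md5.digest(value.getBytes(StandardCharsets.UTF_8));
            int[] pos = new int[4];
            for (int i = 0; i < 4; i++) {
                int word = ((d[4 * i] & 0xFF) << 24) | ((d[4 * i + 1] & 0xFF) << 16)
                         | ((d[4 * i + 2] & 0xFF) << 8) | (d[4 * i + 3] & 0xFF);
                pos[i] = (word & 0x7FFFFFFF) % size;
            }
            return pos;
        }

        @Override
        public void add(String value) {
            for (int p : positions(value)) {
                bits.set(p);
            }
        }

        @Override
        public boolean contains(String value) {
            for (int p : positions(value)) {
                if (!bits.get(p)) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Decorator that guards every call to the wrapped filter with one lock. */
    static final class SynchronizedFilter implements MembershipFilter {
        private final MembershipFilter delegate;
        private final Object lock = new Object();

        SynchronizedFilter(MembershipFilter delegate) {
            this.delegate = delegate;
        }

        @Override
        public void add(String value) {
            synchronized (lock) { delegate.add(value); }
        }

        @Override
        public boolean contains(String value) {
            synchronized (lock) { return delegate.contains(value); }
        }
    }

    /** Preloads 1000 keys, looks them up from several threads, and returns the number of failed lookups. */
    static int runConcurrentLookups(int threads, final int lookupsPerThread) {
        final MembershipFilter filter = new SynchronizedFilter(new Md5Filter(1 << 16));
        for (int i = 0; i < 1000; i++) {
            filter.add("key-" + i);
        }
        final AtomicInteger failures = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int i = 0; i < lookupsPerThread; i++) {
                        try {
                            if (!filter.contains("key-" + (i % 1000))) {
                                failures.incrementAndGet();
                            }
                        } catch (RuntimeException e) {
                            failures.incrementAndGet();
                        }
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                failures.incrementAndGet();
            }
        }
        return failures.get();
    }

    public static void main(String[] args) {
        System.out.println("failed lookups: " + runConcurrentLookups(8, 10000));
    }
}
```

The trade-off of this approach is that all lookups are serialized behind one lock, which may limit throughput under load. A common alternative, again only as an assumption about what might fit here, is to give each thread its own MessageDigest (for example through a ThreadLocal) so that read-only lookups need no lock at all.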