DaveAKing / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0

Bloom filter high memory consumption #1721

Closed by GoogleCodeExporter 9 years ago

GoogleCodeExporter commented 9 years ago
Hi,

I'm using Guava's Bloom filter in a RAM-constrained project.
The memory usage of the Bloom filter itself seems efficient (less than 2 MB for 
1 million elements, fpp = 0.01).

However, it seems to allocate a lot of temporary memory (around 250 MB for 1 
million members), which causes the GC to kick in. Once the GC runs, all of it 
is reclaimed.

Are these temporary allocations necessary? Is it possible to optimize this? 

I tried both Funnels.longFunnel() and Funnels.byteArrayFunnel() and got similar results.

Thank you,
--Pedro

Original issue reported on code.google.com by pedro_og...@yahoo.com on 10 Apr 2014 at 9:01
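
As a sanity check on the "less than 2 MB" figure above, here is a minimal sketch of the textbook Bloom filter sizing formulas. The class and method names are hypothetical and are not claimed to be Guava's API; the code only assumes the standard formulas m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2.

```java
// Back-of-envelope Bloom filter sizing with the textbook formulas.
// Assumption: BloomSizing and its methods are illustrative names only,
// not part of Guava's public API.
final class BloomSizing {

    // Optimal number of bits: m = -n * ln(p) / (ln 2)^2
    static long optimalNumOfBits(long n, double p) {
        return (long) (-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    // Optimal number of hash functions: k = (m / n) * ln 2, at least 1
    static int optimalNumOfHashFunctions(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 1_000_000L;   // expected insertions, as in the report
        double p = 0.01;       // target false positive probability
        long bits = optimalNumOfBits(n, p);
        int k = optimalNumOfHashFunctions(n, bits);
        System.out.printf("bits=%d (%.2f MiB), hashes=%d%n",
                bits, bits / 8.0 / (1024 * 1024), k);
    }
}
```

For n = 1,000,000 and p = 0.01 this works out to roughly 9.6 million bits (a little over 1 MiB) and 7 hash functions, which is consistent with the steady-state footprint reported above.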

GoogleCodeExporter commented 9 years ago
A code snippet would be useful. Can you see this behavior even with a simple 
test case (e.g., make a BloomFilter<Long> and insert 1 million elements)?

Original comment by kak@google.com on 10 Apr 2014 at 9:11

GoogleCodeExporter commented 9 years ago
FWIW, I think the allocations have to be coming from allocating the Hasher.  It 
might be possible to avoid them, but I'm not sure how it could be done.

Original comment by lowas...@google.com on 10 Apr 2014 at 9:14
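
To illustrate what "avoiding the allocations" could look like in principle, here is a minimal sketch of a Bloom filter specialized to primitive long keys that allocates nothing per insertion. This is not Guava's implementation and not a proposed patch: LongBloomSketch is a hypothetical class, it swaps in the SplitMix64 finalizer as the mixing function instead of Murmur3, and it gives up the generality of Funnel in exchange for working only on long.

```java
// Sketch of a per-put allocation-free Bloom filter for primitive long keys.
// Assumptions: hypothetical class, SplitMix64 finalizer as the hash,
// double hashing (h1 + i * h2) to derive the k bit indexes.
final class LongBloomSketch {
    private final long[] bits;   // the only heap allocation, made once
    private final long numBits;
    private final int numHashes;

    LongBloomSketch(long requestedBits, int numHashes) {
        if (requestedBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        int words = Math.toIntExact((requestedBits + 63) >>> 6);
        this.bits = new long[words];
        this.numBits = (long) words * 64;   // rounded up to whole words
        this.numHashes = numHashes;
    }

    // SplitMix64 finalizer: pure arithmetic on a primitive, no objects created.
    private static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    // Sets the k bits for key; returns true if any bit changed.
    boolean put(long key) {
        long h1 = mix(key);
        long h2 = mix(h1 ^ 0x9E3779B97F4A7C15L) | 1L;   // odd step for double hashing
        boolean changed = false;
        long combined = h1;
        for (int i = 0; i < numHashes; i++) {
            long index = (combined & Long.MAX_VALUE) % numBits;
            int word = (int) (index >>> 6);
            long mask = 1L << (int) (index & 63);
            if ((bits[word] & mask) == 0) {
                bits[word] |= mask;
                changed = true;
            }
            combined += h2;
        }
        return changed;
    }

    // Returns false only if key was definitely never put.
    boolean mightContain(long key) {
        long h1 = mix(key);
        long h2 = mix(h1 ^ 0x9E3779B97F4A7C15L) | 1L;
        long combined = h1;
        for (int i = 0; i < numHashes; i++) {
            long index = (combined & Long.MAX_VALUE) % numBits;
            if ((bits[(int) (index >>> 6)] & (1L << (int) (index & 63))) == 0) {
                return false;
            }
            combined += h2;
        }
        return true;
    }

    public static void main(String[] args) {
        // Roughly the size for 1,000,000 elements at fpp = 0.01.
        LongBloomSketch filter = new LongBloomSketch(9_585_058L, 7);
        for (long i = 0; i < 1_000_000L; i++) {
            filter.put(i);   // primitive argument, so no boxing either
        }
        System.out.println(filter.mightContain(42L));
    }
}
```

The trade-off is the point of the sketch: the per-put garbage disappears only because the key type is fixed and the hash is computed with plain arithmetic. A general-purpose filter that accepts arbitrary objects through a funnel does not have that luxury.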

GoogleCodeExporter commented 9 years ago
Here's the repro:

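import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;

public class BloomFilterRepro
{
   public static void main(String[] args)
   {
      // Filter sized for 1,000,000 expected insertions at fpp = 0.01
      BloomFilter<Long> bf = BloomFilter.create(Funnels.longFunnel(), 1000000, 0.01);

      // The loop keeps inserting well past the expected 1,000,000 elements
      for (int i = 0; i < 1000000000; i++)
      {
         bf.put((long) i);   // autoboxed to Long

         // Log every 100,000 insertions
         if (i % 100000 == 0)
         {
            // Force garbage collection
            /*
            for (int j = 0; j < 10; j++)
            {
              System.gc();
            }
            */

            // Used heap in MB: total heap minus free heap
            long usedBytes = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
            System.out.println(i + ":\t" + (1.0 * usedBytes / (1024 * 1024)));
         }
      }
   }
}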

Original comment by pedro_og...@yahoo.com on 11 Apr 2014 at 12:17
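
The repro above reports used heap, which mixes live data with garbage that has not been collected yet. A minimal sketch of a more direct measurement follows, assuming a HotSpot-style JVM that exposes com.sun.management.ThreadMXBean; AllocationProbe is a hypothetical helper name, and it returns -1 where the JVM does not support per-thread allocation counters.

```java
import java.lang.management.ManagementFactory;

// Sketch: count bytes allocated by the current thread while a task runs.
// Assumption: the JVM provides com.sun.management.ThreadMXBean (HotSpot does);
// on other JVMs allocatedBytes returns -1 instead of a measurement.
final class AllocationProbe {
    // Keeps demo allocations reachable so the JIT cannot optimize them away.
    static Object sink;

    static long allocatedBytes(Runnable task) {
        java.lang.management.ThreadMXBean base = ManagementFactory.getThreadMXBean();
        if (!(base instanceof com.sun.management.ThreadMXBean)) {
            return -1;
        }
        com.sun.management.ThreadMXBean bean = (com.sun.management.ThreadMXBean) base;
        if (!bean.isThreadAllocatedMemorySupported() || !bean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long id = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(id);
        task.run();
        return bean.getThreadAllocatedBytes(id) - before;
    }

    public static void main(String[] args) {
        // Stand-in workload: 1,000 arrays of 1 KiB each.
        long bytes = allocatedBytes(() -> {
            byte[][] chunks = new byte[1000][];
            for (int i = 0; i < chunks.length; i++) {
                chunks[i] = new byte[1024];
            }
            sink = chunks;
        });
        System.out.println("allocated bytes: " + bytes);
    }
}
```

Wrapping the insertion loop from the repro in the Runnable and dividing the result by the number of put calls would give an approximate bytes-per-insertion figure that does not depend on when the GC happens to run.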

GoogleCodeExporter commented 9 years ago
Unfortunately I'm not sure there's a whole lot we can do to optimize away the 
allocations.

Since it's not a memory leak (just a temporary hike in memory), I'm going to 
close this as the GC eventually does the right thing.

Original comment by kak@google.com on 22 Apr 2014 at 7:06

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:09

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:07