yangxu998 / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0

Trigger cache cleanup on reads of the *iterators* of the asMap view #608

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
When I set expireAfterWrite on a map, the map correctly hides expired values 
from read views after expiration, but does not evict them until a write 
method is called on the key. This on-write eviction policy does not work if I 
have a read-mostly cache, since the writes are so rare that eviction never 
occurs.

For example, say I'm implementing a webpage cache where the ID is the MD5 of 
the page. In this example, the key space is very large and the hit rate is 
fairly low, which results in a memory leak. The only way to force eviction is 
to keep a master (separate) copy of the key set around and to periodically 
scan the entire map looking for missing values. This is very error prone 
since it involves keeping two concurrent collections in sync.
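
A minimal sketch of that scan-based workaround (illustrative only; 
ScavengingCache and scavenge are made-up names, and keeping the key set in 
sync by hand is exactly the error-prone part described above):

    import java.util.Collections;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.TimeUnit;

    import com.google.common.collect.MapMaker;

    class ScavengingCache {
        private final ConcurrentMap<String, String> map = new MapMaker()
                .expireAfterWrite(10, TimeUnit.MINUTES)
                .makeMap();

        // The separate "master copy" of the key set that must be kept in
        // sync with the map by hand.
        private final Set<String> knownKeys =
                Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());

        void put(String key, String value) {
            map.put(key, value);
            knownKeys.add(key);
        }

        // Run periodically: get() on an expired key returns null, so a null
        // result means the entry can be dropped from the key set as well.
        void scavenge() {
            for (String key : knownKeys) {
                if (map.get(key) == null) {
                    knownKeys.remove(key);
                }
            }
        }
    }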

I suggest that additional (optional) eviction policies be added: one that 
checks for eviction on read, and one that uses a background thread.

Original issue reported on code.google.com by d...@iq80.com on 18 Apr 2011 at 11:13

GoogleCodeExporter commented 9 years ago
Actually, cleanup also occurs once a certain number of consecutive reads have 
occurred.

Original comment by fry@google.com on 19 Apr 2011 at 11:54

GoogleCodeExporter commented 9 years ago
Is the cleanup "once a certain number of consecutive reads have occurred" 
documented anywhere?

Original comment by d...@iq80.com on 19 Apr 2011 at 6:12

GoogleCodeExporter commented 9 years ago
That is an implementation detail, not an API constraint. The API constraint is 
that you can never observe expired entries. As a good citizen, MapMaker will 
also ensure that stale entries are cleaned up. It is not obvious to me that 
this needs to be specified, as it seems implicit in what MapMaker is doing. Let 
me know if you have a specific recommendation to the contrary.

Original comment by fry@google.com on 20 Apr 2011 at 1:12

GoogleCodeExporter commented 9 years ago
The problem is that MapMaker chooses a single policy that doesn't work in all 
situations. For example, say I have a lightly read and written map that 
contains references to very large objects, where you want to ensure that 
references are cleaned up within, say, 10 minutes. In another case, say I 
have a secondary index, so when something is expired from the main store, I 
need to remove it from the secondary index. In that case, I need the listener 
to fire in a reasonable amount of time.

Original comment by d...@iq80.com on 21 Apr 2011 at 12:08

GoogleCodeExporter commented 9 years ago
Well, if what you want is a more flexible policy then you can look at 
MapMaker.cleanupExecutor. I hid the public setter method since we weren't 
previously ready to deploy it. You can test it out already by adding a 
setCleanupExecutor method to MapMaker to set that field. I wanted to do some 
more internal testing before releasing that publicly, but I'd be glad to expose 
the setter in trunk, with the caveat that things could change as we iterate 
through deeper performance testing.
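
A sketch of that experiment (a hypothetical local patch; the field name and 
type are assumed from this comment, not taken from a released API):

    // Hypothetical patch to MapMaker for local testing only: expose the
    // hidden cleanup executor so it can be set from a test.
    public MapMaker setCleanupExecutor(java.util.concurrent.Executor executor) {
        this.cleanupExecutor = executor;  // field name assumed, per above
        return this;  // preserve MapMaker's builder-style chaining
    }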

In any case, let me know if this sounds like it will help you with the issues 
you're seeing.

Original comment by fry@google.com on 21 Apr 2011 at 1:19

GoogleCodeExporter commented 9 years ago
Actually, thinking through this, it is unclear that the current cleanup 
executor will meet your needs. Currently we clean up after 64 consecutive 
reads, and on every write. Are you saying that you have a use case where a 
map is used so infrequently that it needs to be cleaned up despite not being 
used? If that is the case then the current cleanupExecutor won't help you 
either...

Original comment by fry@google.com on 21 Apr 2011 at 2:20

GoogleCodeExporter commented 9 years ago
In my main use case, I have a map where I am caching arbitrary values from a 
caller for a fixed period of time. Some of the keys are updated often, and 
some keys are written once and never updated or read again (like any web 
cache). The problem is that these write-once keys are never evicted. In a 
slightly related case, I have the same setup, but the map is iterated over 
constantly instead of specific keys being fetched. In this case, expired 
values are correctly hidden but never evicted.

Here are some tests I would expect to pass:

    import java.util.Map;
    import java.util.Map.Entry;
    import java.util.Queue;
    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.TimeUnit;

    import javax.annotation.Nullable;

    import org.junit.Assert;
    import org.junit.Test;

    import com.google.common.collect.MapEvictionListener;
    import com.google.common.collect.MapMaker;
    import com.google.common.collect.Maps;

    @Test
    public void testExpirationOfNonFetchedKey()
            throws Exception
    {
        final Queue<Entry<String, String>> evictions = new ArrayBlockingQueue<Entry<String, String>>(1000);

        Map<String, String> map = new MapMaker()
                .expireAfterWrite(1, TimeUnit.MILLISECONDS)
                .evictionListener(new MapEvictionListener<String, String>()
                {
                    public void onEviction(@Nullable String key, @Nullable String value)
                    {
                        evictions.add(Maps.immutableEntry(key, value));
                    }
                })
                .makeMap();

        map.put("apple", "apple");
        map.put("banana", "banana");

        Thread.sleep(100);

        int x = 1;
        for (int i = 0; i < 5000; i++) {
            x += hashIt(map.get("apple"));
        }

        Assert.assertTrue(evictions.contains(Maps.immutableEntry("apple", "apple")));
        // Fails here
        Assert.assertTrue(evictions.contains(Maps.immutableEntry("banana", "banana")));

        System.err.println();
        System.err.println("ignore me: " + x);
    }

    @Test
    public void testExpirationWithIteration()
            throws Exception
    {
        final Queue<Entry<String, String>> evictions = new ArrayBlockingQueue<Entry<String, String>>(1000);

        Map<String, String> map = new MapMaker()
                .expireAfterWrite(1, TimeUnit.MILLISECONDS)
                .evictionListener(new MapEvictionListener<String, String>()
                {
                    public void onEviction(@Nullable String key, @Nullable String value)
                    {
                        evictions.add(Maps.immutableEntry(key, value));
                        System.err.printf("EVICT: %s=%s%n", key, value);
                    }
                })
                .makeMap();

        map.put("apple", "apple");
        map.put("banana", "banana");

        Thread.sleep(100);

        int x = 1;
        for (int i = 0; i < 5000; i++) {
            for (Entry<String, String> entry : map.entrySet()) {
                x += hashIt(entry.getKey()) + hashIt(entry.getValue());
            }
        }

        // Fails here
        Assert.assertTrue(evictions.contains(Maps.immutableEntry("apple", "apple")));
        Assert.assertTrue(evictions.contains(Maps.immutableEntry("banana", "banana")));

        System.err.println();
        System.err.println("ignore me: " + x);

        Thread.sleep(10000);
    }

    private int hashIt(String a)
    {
        return a == null ? 0 : a.hashCode();
    }

Original comment by d...@iq80.com on 21 Apr 2011 at 5:56

GoogleCodeExporter commented 9 years ago
The tests you provide aren't quite correct. After the Thread.sleep you need 
to perform some write operation (even a no-op, such as removing or replacing 
something that's not there), or else perform 64 read operations. Once you've 
done that, the stale entries should be evicted. If not, that's a bug.
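
For instance, a minimal change to the first test along these lines (a sketch, 
assuming the cleanup behavior described in this thread):

    // After the sleep, issue a no-op write: removing an absent key changes
    // no mappings but still counts as a write and triggers cleanup of that
    // key's segment.
    Thread.sleep(100);
    map.remove("not-present");

    Assert.assertTrue(evictions.contains(Maps.immutableEntry("apple", "apple")));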

The description you provide above sounds to me like it should be working fine. 
As long as any of the keys are updated, or even after sufficient reads, stale 
entries should be evicted. If that really isn't happening then there is indeed 
a bug.

Original comment by fry@google.com on 22 Apr 2011 at 2:49

GoogleCodeExporter commented 9 years ago
Can you confirm if you're seeing this behavior with r09, or with trunk? I just 
found a post-r09 bug that fails to perform cleanup after subsequent cache hits 
to computing maps.

Original comment by fry@google.com on 22 Apr 2011 at 3:33

GoogleCodeExporter commented 9 years ago
Yes, I'm using guava-r09.jar.

I'm not sure I understand your previous comment. Each of the tests does the 
following:

1. Initial setup with evict 1ms after write
2. Sleep 100ms
3. Perform 5000 reads (either via get or iteration)
4. Verify

What the tests are designed to show is that if I don't perform a read for the 
exact key, the entries are never evicted.

Original comment by d...@iq80.com on 22 Apr 2011 at 3:47

GoogleCodeExporter commented 9 years ago
Oops, my bad. Let me re-read them...

Original comment by fry@google.com on 22 Apr 2011 at 6:14

GoogleCodeExporter commented 9 years ago
Aha, I figured it out.

The problem with the first test is that cleanup is done per-segment, and 
you're putting the two entries in different segments. What you really need 
_in a test environment_ is to set concurrencyLevel(1).
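
A sketch of that change to the test setup, where listener stands in for the 
anonymous MapEvictionListener from the tests above:

    // Pin the map to a single segment so a read or write on any one key
    // can clean up expired entries for every key in the map.
    Map<String, String> map = new MapMaker()
            .concurrencyLevel(1)
            .expireAfterWrite(1, TimeUnit.MILLISECONDS)
            .evictionListener(listener)
            .makeMap();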

The second test I would expect to fail with the current code, which only cleans 
up a segment on a write or on every Nth explicit read. It definitely sounds 
sensible to me to also allow indirect reads to trigger cleanup. I'll look into 
that. Unless you disagree I'll rename this bug to cover this case, which is the 
only new issue that I'm aware of.

Original comment by fry@google.com on 22 Apr 2011 at 6:36

GoogleCodeExporter commented 9 years ago
The need for this may be reduced now that we'll have an explicit 
Cache.doMaintenance() (not its real name) method, but this could still have 
merit on its own. (?)
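
For reference, the maintenance method that later shipped on 
com.google.common.cache.Cache is cleanUp(). A sketch of driving it from a 
background thread, so expiry happens even on a completely idle cache (the 
one-minute schedule is illustrative):

    // Cache and CacheBuilder are from com.google.common.cache; the executor
    // classes are from java.util.concurrent.
    final Cache<String, String> cache = CacheBuilder.newBuilder()
            .expireAfterWrite(10, TimeUnit.MINUTES)
            .build();
    ScheduledExecutorService cleaner =
            Executors.newSingleThreadScheduledExecutor();
    cleaner.scheduleWithFixedDelay(new Runnable() {
        @Override public void run() {
            cache.cleanUp();  // evicts expired entries, fires removal listeners
        }
    }, 1, 1, TimeUnit.MINUTES);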

Original comment by kevin...@gmail.com on 1 Sep 2011 at 5:37

GoogleCodeExporter commented 9 years ago
At a minimum, reads of the asMap view should be incrementing the cleanup 
counter. We could do that part right away.

*Probably* we should even trigger cleanup during such a call.  I have minor 
reservations because Map.get() doesn't carry the contract that you might be 
forced to wait while cleanup happens, but this is probably okay.

TBD exactly which Map operations we mean here.

Original comment by kevinb@google.com on 9 Nov 2012 at 9:55

GoogleCodeExporter commented 9 years ago
Actually, asMap().get() does already trigger cleanup; this bug seems to be 
about reading through one of its iterators.

Original comment by kevinb@google.com on 9 Nov 2012 at 9:59

GoogleCodeExporter commented 9 years ago
This is starting to seem of limited use. If you have a strong argument for 
why this could save an otherwise bad situation, please feel free to present 
it and we can reopen.

Original comment by kevinb@google.com on 9 Nov 2012 at 10:01

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:15
