DaveAKing / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0
0 stars 0 forks source link

Cache: complex keys causing very low hit rate and large number of evictions #1666

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
We have a complex key consisting of a POJO with 4 string fields, an Integer and 
a DateMidnight (from Joda Time). 2 out of the 4 String fields can be null at 
any time.

It seems with a simple key (e.g. a single String) the cache is very fast,

However, with a complex key like this the hit rate and number of evictions goes 
up dramatically, e.g.:

Hit rate: 33.30 %
Miss rate: 66.69 %
Average load penalty: 5.44 ms
Eviction count: 15002
Request count: 24402
Load exception rate: 0.00 %

if we convert all these keys into a single String using a StringBuilder is 
starts being super fast again with a hit rate of ~99.9%.

The problem with that is in the loading method we have to parse the String to 
break out the original 6 pieces that made it. Not preferable for an application 
with near real-time SLAS.

Set up:

cache = CacheBuilder.newBuilder()
.maximumSize(100_000)
.expireAfterWrite(5, TimeUnit.MINUTES)
.concurrencyLevel(100)
.recordStats()
.build(
new CacheLoader<ItineraryResultCacheKey, ItineraryResult>() {
                            @Override
                            public ItineraryResult load(ItineraryResultCacheKey key) throws Exception {

.... cache loading code....

                        });
    }

On OpenJDK 7. Tested this with both 14.01 and 16.01, same results.

I've tried multiple things (using Date with all hours/minutes/seconds set to 0 
to ensure key equality), converting all values to a single string and coding 
the equals() and hashCode() to use that instead of the other fields.

To no avail.

It seems that if you put in a complex POJO as a key, something goes wrong in 
the Cache internals and we consistently get a hit rate of ~33%, never more. 
Very weird that it is always so consistent.

I admit, it is a bizarre issue. Any suggestions on what we may be doing wrong 
are welcome.

Original issue reported on code.google.com by jace...@gmail.com on 11 Feb 2014 at 5:13

GoogleCodeExporter commented 9 years ago
I've also played around with the concurrencyLevel, anywhere from 1 to exact 
number of threads in the HTTP server. Made no discernible difference either.

Original comment by jace...@gmail.com on 11 Feb 2014 at 5:15

GoogleCodeExporter commented 9 years ago
Curious what your equals/hashCode methods look like on your 
ItineraryResultCacheKey POJO...

Original comment by kak@google.com on 11 Feb 2014 at 5:18

GoogleCodeExporter commented 9 years ago
At first I had a default one generated via Lombok. That is how we found the 
original issues.

Since that was my first suspicion I overrode it manually to just use the 
aforementioned single String representation of all the parts of the POJO, e.g.

    private String id;

    @Override
    public boolean equals(Object obj) {
        if (obj instanceof ItineraryResultCacheKey) {
            return this.id.equals(((ItineraryResultCacheKey) obj).getId());
        } else {
            return false;
        }
    }

    @Override
    public int hashCode() {
        return this.id.hashCode();
    }

where "id" gets created in the Constructor by appending all the 6 fields into a 
single string using a StringBuilder (with "" put if string is null and the 
explicit year/month/day integers put into it for the date part (so no actual 
Date.toString() where we could get millsecond/second differences).,

Also, we found this in simple perf testing where we are basically sending the 
same 10 requests over and over (using multi-mechanize) so the range of keys 
being sent is actually very small.  It's not real-life data.

Is there some part of code you could recommend for me to trace or debug 
manually (a particular Class/method)?

Unfortunately due to the proprietary nature of the application I am very 
limited in what type of code I can post.

Original comment by jace...@gmail.com on 11 Feb 2014 at 5:30

GoogleCodeExporter commented 9 years ago
I hang my head in shame. Equals()/hashCode() are not the problem.

We had previously existing code in a totally different section  of the app that 
was actually mutating this key during multiple parts of the processing (one of 
the fields). This was done as part of a naive attempt to reduce GC pressure by 
reusing object instances (instead of instantiating them multiple times with 
minor variations).

Once I removed that and made it properly instantiate a new instance every time 
every started running properly.

Please close as invalid.

Original comment by jace...@gmail.com on 11 Feb 2014 at 5:56

GoogleCodeExporter commented 9 years ago

Original comment by lowas...@google.com on 11 Feb 2014 at 5:57

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<issue id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:10

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:07