lhfei / hibernate-memcached

Automatically exported from code.google.com/p/hibernate-memcached
0 stars 0 forks source link

hashed key strategies vulnerable to collisions (especially for query cache) #22

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
We've been using this library to enable memcached for the hibernate second 
level cache and query cache.  After some careful debugging, we found that the 
default HashCodeKeyStrategy was generating identical memcache keys for 
different query parameters.

For example, one query had parameters 60877 and "1.1.1" while another query had 
35891 and "2.2.1".  The HashCodeKeyStrategy uses the 4-byte instance hashCode 
for the given key, which in this case was the org.hibernate.cache.QueryKey.  
That implementation uses the common pattern of building a hash by taking the 
multiplying the current partial result by a prime (37) and adding the next 
hashCode.  This results in a collision in this case (and many others we found). 
 i.e. Long.valueOf(60877).hashCode() * 37 + "1.1.1".hashCode() == 
Long.valueOf(35891).hashCode() * 37 + "2.2.1".hashCode()

This 4-byte hashCode is fine in a situation like java.util.HashMap where an 
equals() check is performed for any keys with matching hash code, but this 
library does not store the original Key and is unable to perform such a check 
to detect a collision, so any collision results in the wrong value being 
returned.

The existing Md5KeyStrategy and Sha1KeyStrategy suffer from the same common 
collision problem because they take their MD5 of SHA1 hash after the 4-byte 
hashCode is computed, due to the fact that they inherit 
HashCodeKeyStrategy.transformKeyObject.

The StringKeyStrategy also doesn't work because the query strings are too long, 
have the parameters truncated and cause more collisions.

Thus none of the existing key strategies are suitable to be used for the 
hibernate query cache, unless your application can tolerate sometimes getting 
the wrong result for a hibernate query.

A key strategy that takes a longer hash (MD5, SHA) of the string version of the 
key should be sufficient.  We're currently evaluating whether to simply disable 
the query cache entirely ( See 
http://tech.puredanger.com/2009/07/10/hibernate-query-cache/ ).  If we decide 
to continue using it, I will probably submit a patch with such a strategy.

Original issue reported on code.google.com by ddlat...@gmail.com on 29 Mar 2011 at 11:50

GoogleCodeExporter commented 8 years ago
Subclassing the HashCodeKeyStrategy was a terrible idea in retrospect.
Either of the digest strategies would solve this problem if they were using the 
StringKeyStrategy.

I think a proper solution would be to have the digest based strategies extend 
the AbstractKeyStrategy and delegate to HashCodeKeyStrategy with a 
configuration option to have it delegate to StringKeyStrategy instead.

I'd gladly accept a patch implementing such a change :)

Original comment by raykrue...@gmail.com on 30 Mar 2011 at 12:31

GoogleCodeExporter commented 8 years ago
On a whim I dove in and tore up the key strategies. Essentially Sha1KeyStrategy 
is now the default, and the recommended, key strategy. It uses the toString() 
and hashCode() values together. Essentially the key is sha1(region + clearIndex 
+ key.toString() + key.hashCode()). Hibernate already implements toString() on 
all of it's key objects in order to support OsCache and others.

I've deployed a 1.3-SNAPSHOT jar out to the repo. Pull it down and give it a 
shot.
You can manually grab it from 
http://raykrueger.googlecode.com/svn/repository/com/googlecode/hibernate-memcach
ed/1.3-SNAPSHOT/

Original comment by raykrue...@gmail.com on 6 Apr 2011 at 4:58

GoogleCodeExporter commented 8 years ago
Looks good, I will give it a try.

I noticed that MemcachedCache.keyStrategy is still initialized to 
HashCodeKeyStrategy, but then configured to use Sha1KeyStrategy.

Original comment by ddlat...@gmail.com on 7 Apr 2011 at 10:24

GoogleCodeExporter commented 8 years ago
Can you point me to that? Not sure what you mean.

Original comment by raykrue...@gmail.com on 7 Apr 2011 at 10:25

GoogleCodeExporter commented 8 years ago
Err nevermind, I see it.
Thanks

Original comment by raykrue...@gmail.com on 7 Apr 2011 at 10:30

GoogleCodeExporter commented 8 years ago
Ok Fixed that up. Thanks for catching it.

I'm going out drinking now :)

Original comment by raykrue...@gmail.com on 7 Apr 2011 at 11:11

GoogleCodeExporter commented 8 years ago
This issue can now be closed as it's fixed in 1.3

Original comment by ddlat...@gmail.com on 18 Aug 2011 at 4:29

GoogleCodeExporter commented 8 years ago
heh great! Thanks for circling back, I guess I should put out an official 1.3 
release then.

Original comment by raykrue...@gmail.com on 18 Aug 2011 at 4:32