Queries with projections generate different cache keys in different JVMs

GoogleCodeExporter commented 8 years ago

If you create a cacheable query that uses a projection (for example, one that 
returns only the id, not the whole entity), then the memcached key generated for 
that query will not necessarily be the same each time it is run.  The Hibernate 
QueryKey will include a non-null custom transformer.  Any KeyStrategy that 
uses toString (everything but HashCodeKeyStrategy) will invoke QueryKey.toString, 
which calls the result transformer's toString.  That is the default 
Object.toString() method, whose output includes the instance's identity hash 
code.  This will be different for different JVMs, so multiple processes in a 
cluster will not share the cache entry, nor will the same process if it is 
restarted.
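
To make the failure mode concrete, here is a minimal sketch in plain Java. PlainTransformer is a hypothetical stand-in for a result transformer that does not override toString() or hashCode(); it is not a Hibernate class. The sketch only shows what the inherited methods return.

```java
public class DefaultToStringDemo {

    // Hypothetical stand-in for a transformer that inherits Object.toString()
    // and Object.hashCode(). Not a Hibernate type.
    static class PlainTransformer {
    }

    // Rebuilds by hand the string that Object.toString() is documented to return:
    // the class name, '@', and the hash code in hex.
    static String expectedDefaultToString(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        PlainTransformer t = new PlainTransformer();
        // The part after '@' is the identity hash code, which the JVM assigns
        // per instance, so it should not be expected to match across JVM runs.
        System.out.println(t);
        System.out.println(expectedDefaultToString(t));
    }
}
```

Any cache key that embeds this string therefore embeds a per-instance value, which is why the keys stop matching between processes.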

Unfortunately, I don't see a simple fix; the only options I can think of are 
digging in with introspection or hackily manipulating the result of toString().  
Does anyone have a better idea?

Original issue reported on code.google.com by ddlat...@gmail.com on 18 Aug 2011 at 4:42

GoogleCodeExporter commented 8 years ago

Ok, here's a solution that relies on QueryKey being Serializable instead of on 
toString:

public class Sha1SerializableKeyStrategy extends Sha1KeyStrategy
{

    @Override
    protected String transformKeyObject(Object key)
    {
        try
        {
            ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(byteStream);
            out.writeObject(key);
            // Close (and thereby flush) before reading, or buffered bytes may be missing.
            out.close();
            return StringUtils.toHexString(byteStream.toByteArray());
        }
        catch (IOException e)
        {
            throw new RuntimeException(e);
        }
    }

}

NOTE - this code is just an illustration.  It's probably better not to transform 
the bytes to a string, concatenate them, and then change them back to bytes again.
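
As a rough illustration of the point in the NOTE, the serialized bytes can be fed straight into a digest without an intermediate string. This is a sketch using only JDK classes; the class and method names (SerializedDigestSketch, sha1OfSerializedForm) are made up for this example and are not part of hibernate-memcached.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SerializedDigestSketch {

    // Serializes the key and returns the SHA-1 of the raw bytes as lowercase hex.
    // Hypothetical helper, not an API of hibernate-memcached.
    static String sha1OfSerializedForm(Serializable key) {
        try {
            ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(byteStream)) {
                out.writeObject(key);
            } // closing the stream flushes buffered bytes into byteStream
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(byteStream.toByteArray());
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException | NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A String is used as the key only to keep the example self-contained.
        System.out.println(sha1OfSerializedForm("select id from Foo"));
    }
}
```

The result depends only on the serialized form of the key, so two value-equal keys produce the same digest as long as every object in the graph serializes deterministically.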

Original comment by ddlat...@gmail.com on 18 Aug 2011 at 5:11

GoogleCodeExporter commented 8 years ago

It's been a while since I've done anything with query caching (or Hibernate in 
general), but can't you provide a name for the cached query? I believe if you 
do that it should solve the problem.

Original comment by raykrue...@gmail.com on 19 Aug 2011 at 3:22

GoogleCodeExporter commented 8 years ago

Closing this down after receiving no reply. I'm assuming naming the query fixed 
it.

Original comment by raykrue...@gmail.com on 17 Jul 2013 at 12:35

Changed state: WontFix

GoogleCodeExporter commented 8 years ago

Hi Ray,

Sorry for not responding to the question earlier - not sure how I missed it.

Naming the query will not help in this case.  
DigestKeyStrategy.transformKeyObject returns key.toString() + ":" + 
key.hashCode().

QueryKey.toString is as follows:

    public String toString() {
        StringBuffer buf = new StringBuffer()
                .append( "sql: " )
                .append( sqlQueryString );
        if ( positionalParameterValues != null ) {
            buf.append( "; parameters: " );
            for ( int i = 0; i < positionalParameterValues.length; i++ ) {
                buf.append( positionalParameterValues[i] ).append( ", " );
            }
        }
        if ( namedParameters != null ) {
            buf.append( "; named parameters: " ).append( namedParameters );
        }
        if ( filterKeys != null ) {
            buf.append( "; filterKeys: " ).append( filterKeys );
        }
        if ( firstRow != null ) {
            buf.append( "; first row: " ).append( firstRow );
        }
        if ( maxRows != null ) {
            buf.append( "; max rows: " ).append( maxRows );
        }
        if ( customTransformer != null ) {
            buf.append( "; transformer: " ).append( customTransformer );
        }
        return buf.toString();
    }

The customTransformer typically does not override Object.toString, so the JVM 
prints the class name and the identity hash code (which looks like a memory 
address), and that value varies from JVM to JVM.  A similar problem occurs with 
the hash code.

As a result, every process in the cluster caches the query separately.

Consequently, we're currently using this key strategy instead:

/**
 * KeyStrategy that uses java Serialization so that queries using projections can share the 
 * cache across JVMs.
 * See
 * http://code.google.com/p/hibernate-memcached/issues/detail?id=23
 * 
 * Enable via <property name="hibernate.memcached.keyStrategy">[yourpackagehere].Sha1SerializableKeyStrategy</property>
 * 
 * WARNING - not all classes used are currently Serializable
 */
public class Sha1SerializableKeyStrategy implements KeyStrategy
{

    @Override
    public String toKey(String regionName, long clearIndex, Object key)
    {
        MessageDigest digest = HashType.SHA1.get();
        digest.reset();
        digest.update(regionName.getBytes(Charsets.UTF_8));
        digest.update(Longs.toByteArray(clearIndex));
        digest.update(getSerializableBytes(key));
        return ByteArrayUtil.toHexString(digest.digest());
    }

    private static byte[] getSerializableBytes(Object o)
    {
        try
        {
            ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(byteStream);
            out.writeObject(o);
            out.close();
            return byteStream.toByteArray();
        }
        catch (IOException e)
        {
            throw new RuntimeException(e);
        }
    }

}
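
The WARNING in the Javadoc above matters in practice: if any object reachable from the key is not Serializable, writeObject throws a NotSerializableException, which getSerializableBytes would rethrow as a RuntimeException. The following sketch, using only JDK classes, shows one way to probe for that. The names SerializableCheckSketch, canSerialize and NotSerializablePart are hypothetical and exist only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

public class SerializableCheckSketch {

    // Hypothetical key component that does not implement Serializable.
    static class NotSerializablePart {
    }

    // Returns true if Java serialization accepts the whole object graph,
    // false if some reachable object is not Serializable.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize("a plain string"));
        System.out.println(canSerialize(new Object[] { "ok", new NotSerializablePart() }));
    }
}
```

A check like this could be run in a test against representative query keys before switching the key strategy in production.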

Original comment by ddlat...@gmail.com on 17 Jul 2013 at 3:46

GoogleCodeExporter commented 8 years ago

Yeah this came up on the list a long time ago and there was some sort of 
solution I thought. It's been a really long time though. Hibernate never really 
left very good options there.

Have you looked at how QueryKey.hashCode() is implemented? Maybe that's 
actually usable?

P.S. I have to rely on reports from the field, haven't touched Java or 
Hibernate in 4 years, so thanks for following up on this.

Original comment by raykrue...@gmail.com on 17 Jul 2013 at 3:54

Changed state: New

GoogleCodeExporter commented 8 years ago

QueryKey.hashCode() has the same problem: it calls 
customTransformer.hashCode(), and the transformer does not override 
Object.hashCode() either, so the value varies from JVM to JVM.

Original comment by ddlat...@gmail.com on 17 Jul 2013 at 4:04
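
For contrast, this is roughly what a transformer with a JVM-independent identity would look like. It is only a sketch under assumptions not made in this thread: that the transformer is a class you control and that it is stateless. StableTransformer is a hypothetical class and does not implement any Hibernate interface here.

```java
import java.io.Serializable;

public class StableIdentitySketch {

    // Hypothetical stateless transformer-like class; NOT a Hibernate type.
    // All instances are treated as equal, and toString()/hashCode() depend
    // only on the class name, not on the instance.
    static class StableTransformer implements Serializable {
        private static final long serialVersionUID = 1L;

        @Override
        public boolean equals(Object other) {
            return other != null && other.getClass() == getClass();
        }

        @Override
        public int hashCode() {
            // String.hashCode() is specified by the JDK, so this is the same in every JVM.
            return getClass().getName().hashCode();
        }

        @Override
        public String toString() {
            return getClass().getName();
        }
    }

    public static void main(String[] args) {
        StableTransformer a = new StableTransformer();
        StableTransformer b = new StableTransformer();
        System.out.println(a.equals(b));
        System.out.println(a.hashCode() == b.hashCode());
        System.out.println(a);
    }
}
```

This does not help with transformers that cannot be modified, which is the situation the serialization-based key strategy is meant to cover.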
