ehcache / ehcache3

Ehcache 3.x line
http://www.ehcache.org
Apache License 2.0

Buffer supplied to custom `Serializer` is not consistent #3243

Closed (Naros closed this 4 weeks ago)

Naros commented 4 weeks ago

We have added several custom Serializer types for objects we need to store in Ehcache that cannot implement Serializable. However, I am noticing that the buffer supplied to Serializer#read is inconsistent and depends on where the buffer was sourced.

When a cache specifies only heap as a resource pool, the bytes produced by Serializer#serialize are exactly the bytes in the buffer passed to Serializer#read. When using only disk-based resource pools, the buffer supplied to Serializer#read is always prepended with a 40-byte preamble, which I assume has something to do with the disk-level format or the fact that the entry was loaded from disk.

This makes it really difficult to deserialize the data with a custom Serializer when the cache is configured with a combination of heap and disk resource pools, because the buffer can arrive in either form.

Is there a way to make this deterministic for a custom serializer, or am I missing something?

Naros commented 4 weeks ago

Hi @chrisdennis, I know you spoke up in my other issue, but this one here in particular is a severe blocker for us to add any sort of ehcache support into our CDC solution. Any insight into what could be wrong?

Naros commented 4 weeks ago

Actually I found the answer after looking more closely at the ByteBuffer state passed into the read method. We specifically were doing this:

try (ByteArrayInputStream input = new ByteArrayInputStream(buffer.array()))

This didn't take into account the buffer offset that Ehcache sets depending on how the value was loaded through the portability class layers. Once we realized that and simply used input.skip(buffer.arrayOffset()) rather than skipping an explicit 40 bytes, it works regardless of where the value came from.
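For anyone landing here later, a minimal sketch of the offset issue using only the JDK (no Ehcache classes). It assumes the buffer is heap-backed and simulates the preamble with a 40-byte prefix; the names `ArrayOffsetDemo` and `readUtf8` are made up for illustration. It also adds `buffer.position()` to `buffer.arrayOffset()`, which should cover buffers whose position is not zero, and passes the start and length to the stream constructor instead of calling `skip`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ArrayOffsetDemo {

    // Hypothetical helper: decodes the remaining bytes of an array-backed
    // buffer as UTF-8, starting at the buffer's real offset in the array.
    static String readUtf8(ByteBuffer buffer) {
        if (!buffer.hasArray()) {
            // Direct and read-only buffers expose no backing array.
            throw new IllegalArgumentException("buffer has no accessible backing array");
        }
        // array() returns the WHOLE backing array, so compute where this
        // buffer's readable content actually begins inside it.
        int start = buffer.arrayOffset() + buffer.position();
        try (ByteArrayInputStream input =
                 new ByteArrayInputStream(buffer.array(), start, buffer.remaining())) {
            return new String(input.readAllBytes(), StandardCharsets.UTF_8); // Java 9+
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = "hello".getBytes(StandardCharsets.UTF_8);

        // Simulate a stored entry: 40 bytes of preamble followed by the payload.
        byte[] stored = new byte[40 + payload.length];
        System.arraycopy(payload, 0, stored, 40, payload.length);

        // A slice that starts after the preamble still shares the full array.
        ByteBuffer view = ByteBuffer.wrap(stored, 40, payload.length).slice();

        System.out.println(view.array().length); // 45: preamble included
        System.out.println(view.arrayOffset());  // 40
        System.out.println(readUtf8(view));      // hello
    }
}
```

The same helper returns the payload unchanged when the buffer wraps the array directly (offset 0), which is the heap-only case described above.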

Hopefully this will be helpful for others who may stumble onto this in the future.

chrisdennis commented 4 weeks ago

You might want to be careful with that code. It's not always safe to call .array() on a ByteBuffer, since it may not have a backing array (it could be a direct buffer). You might find it more convenient to use org.ehcache.core.util.ByteBufferInputStream, which reads (a slice of) the ByteBuffer directly.
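To illustrate the idea without depending on Ehcache, here is a rough JDK-only sketch of a stream that reads from a slice of the buffer and never calls .array(). The `BufferInputStream` class and `drain` helper below are hypothetical stand-ins written for this example; they are not the Ehcache class, whose exact API is not shown in this thread.

```java
import java.io.InputStream;
import java.nio.ByteBuffer;

public class BufferStreamDemo {

    // Hypothetical stand-in: an InputStream over a slice of a ByteBuffer.
    // Works for heap, direct, and read-only buffers alike.
    static final class BufferInputStream extends InputStream {
        private final ByteBuffer buffer;

        BufferInputStream(ByteBuffer source) {
            // slice() covers position..limit and leaves the source untouched.
            this.buffer = source.slice();
        }

        @Override
        public int read() {
            return buffer.hasRemaining() ? (buffer.get() & 0xFF) : -1;
        }

        @Override
        public int read(byte[] dst, int off, int len) {
            if (len == 0) {
                return 0;
            }
            if (!buffer.hasRemaining()) {
                return -1;
            }
            int n = Math.min(len, buffer.remaining());
            buffer.get(dst, off, n);
            return n;
        }

        @Override
        public int available() {
            return buffer.remaining();
        }
    }

    // Copies everything readable from the source buffer via the stream.
    static byte[] drain(ByteBuffer source) {
        BufferInputStream in = new BufferInputStream(source);
        byte[] out = new byte[in.available()];
        in.read(out, 0, out.length);
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer direct = ByteBuffer.allocateDirect(8);
        direct.put(new byte[] {1, 2, 3, 4, 5, 6, 7, 8});
        direct.flip();
        direct.position(3); // pretend the first 3 bytes are a preamble

        System.out.println(direct.hasArray());    // false: array() would throw
        System.out.println(drain(direct).length); // 5
        System.out.println(direct.position());    // 3: source is not consumed
    }
}
```

Because the stream works on a slice, the caller's buffer keeps its position and limit, so the same read path applies whichever tier supplied the buffer.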