Closed RuedigerMoeller closed 10 years ago
Are you using this method in Lang directly, or via an HFT collections class such as SharedHashMap?
On 27 May 2014, at 16:38, RuedigerMoeller notifications@github.com wrote:
Hi Peter,
As we have hundreds of datastructures, we go down the serialization route. I need a way to efficiently plug in a custom serializer.
checking AbstractBytes
public void writeObject(@Nullable Object obj) {
    if (obj == null) {
        writeByte(NULL);
        return;
    }

    Class<?> clazz = obj.getClass();
    final BytesMarshallerFactory bytesMarshallerFactory = bytesMarshallerFactory();
    BytesMarshaller em = bytesMarshallerFactory.acquireMarshaller(clazz, false);
    if (em == NoMarshaller.INSTANCE && autoGenerateMarshaller(obj))
        em = bytesMarshallerFactory.acquireMarshaller(clazz, true);

    if (em != NoMarshaller.INSTANCE) {
        if (em instanceof CompactBytesMarshaller) {
            writeByte(((CompactBytesMarshaller) em).code());
            em.write(this, obj);
            return;
        }
        writeByte(ENUMED);
        writeEnum(clazz);
        em.write(this, obj);
        return;
    }
    writeByte(SERIALIZED);
    // TODO this is the lame implementation, but it works.
    try {
        ObjectOutputStream oos = new ObjectOutputStream(this.outputStream());
        oos.writeObject(obj);
    } catch (IOException e) {
        throw new IllegalStateException(e);
    }
    checkEndOfBuffer();
}
Is there a way to basically shortcut this routine, e.g. to also avoid

Class<?> clazz = obj.getClass();
final BytesMarshallerFactory bytesMarshallerFactory = bytesMarshallerFactory();
BytesMarshaller em = bytesMarshallerFactory.acquireMarshaller(clazz, false);

as this lookup can cost something like 10% in the case of smallish objects? If we can agree on some 'pluggable' interface, I can do the work and contribute. Or is there another way to customize serialization?
regards, Rüdiger
I want to use SharedHashMap. fast-serialization uses some trickery to avoid hash lookups and operations that potentially blur locality, such as object.getClass() and instanceof. Serialization performance for small objects might therefore suffer if serialization is only called as a last resort, so I need a hook which kicks in earlier. The difference can be quite noticeable when putting smallish objects using serialization.
@RuedigerMoeller There is BytesMarshallable, and there is a fast path for writing it both in VSHM and in AbstractBytes.writeInstance(), if I understand your idea right.
I disagree. BytesMarshallable requires a lot of changes to existing code. Think of a system with hundreds of data structures: nobody is willing to pay the price for custom, hand-written serialization, so I need to use object serialization. I have a very well-performing implementation of generic object serialization which I want to plug in. BytesMarshallable does not cut it. And even if I patch out the ObjectSerialization, the path still is:
if (BytesMarshallable.class.isAssignableFrom(objClass)) {
    ((BytesMarshallable) obj).writeMarshallable(this);
} else if (Externalizable.class.isAssignableFrom(objClass)) {
    ((Externalizable) obj).writeExternal(this);
} else if (CharSequence.class.isAssignableFrom(objClass)) {
    writeUTFΔ((CharSequence) obj);
} else {
    writeObject(obj);
}
I mean, the instanceof chain adds serious overhead when serializing small objects (which FST does in the region of some 100 nanos if used and tuned right). Additionally, you take String objects away from serialization. Basically, I would need a plug to completely replace the decision tree for encoding and decoding. Anyway, I can fork or write a wrapper. Just some input from someone evaluating this.
@RuedigerMoeller Do you mean that adding methods like customKeySerialization(BiConsumer<Bytes, K> serializer), and a corresponding one for values, to the SharedHashMapBuilder API would be useful?
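To make the proposal concrete, here is a minimal sketch of what such builder hooks might look like. Everything in it is hypothetical: `SketchBuilder`, `customValueSerialization` and `writeEntry` are illustration names, not part of HugeCollections, and `java.nio.ByteBuffer` stands in for the library's `Bytes` type so that the example compiles on its own.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.function.BiConsumer;

// Hypothetical sketch only: this is NOT the SharedHashMapBuilder API.
// ByteBuffer is used as a stand-in for the library's Bytes abstraction.
final class SketchBuilder<K, V> {
    private BiConsumer<ByteBuffer, K> keyWriter;
    private BiConsumer<ByteBuffer, V> valueWriter;

    SketchBuilder<K, V> customKeySerialization(BiConsumer<ByteBuffer, K> serializer) {
        this.keyWriter = serializer;
        return this;
    }

    SketchBuilder<K, V> customValueSerialization(BiConsumer<ByteBuffer, V> serializer) {
        this.valueWriter = serializer;
        return this;
    }

    // Writes one entry through the plugged-in serializers only:
    // no per-class marshaller lookup and no instanceof chain on this path.
    void writeEntry(ByteBuffer out, K key, V value) {
        keyWriter.accept(out, key);
        valueWriter.accept(out, value);
    }
}

final class SketchBuilderExample {
    // Encodes the entry ("ab", 7L) as: int length, UTF-8 bytes, long value.
    static ByteBuffer example() {
        SketchBuilder<String, Long> builder = new SketchBuilder<String, Long>()
                .customKeySerialization((buf, k) -> {
                    byte[] raw = k.getBytes(StandardCharsets.UTF_8);
                    buf.putInt(raw.length).put(raw);
                })
                .customValueSerialization((buf, v) -> buf.putLong(v));
        ByteBuffer out = ByteBuffer.allocate(64);
        builder.writeEntry(out, "ab", 7L);
        out.flip();
        return out;
    }
}
```

A real API would also need matching deserializers for keys and values; they are left out here to keep the sketch short.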
@RuedigerMoeller you may find this interface useful: net.openhft.collections.ReplicatedSharedHashMap.EntryExternalizable. It is implemented by net.openhft.collections.VanillaSharedReplicatedHashMap.
To make it completely pluggable you can avoid using writeObject() altogether. You can instead use the OutputStream/InputStream, or write your own serializer/deserializer which writes/reads the data however you wish. writeObject is provided as a convenience, but if it doesn't do what you need, don't call it.
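As an illustration of the stream-based route, here is a minimal hand-written codec over plain `java.io` streams. `SketchPoint` and `SketchPointCodec` are hypothetical names invented for this example. How the streams are obtained from a Bytes instance is not shown; the writeObject() listing earlier in the thread uses `outputStream()` for the write side.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical value type, used for illustration only.
final class SketchPoint {
    final int x;
    final int y;

    SketchPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

// Hand-written codec that never goes through writeObject():
// the caller decides exactly which bytes are written and read.
final class SketchPointCodec {
    static void write(OutputStream out, SketchPoint p) {
        try {
            DataOutputStream data = new DataOutputStream(out);
            data.writeInt(p.x);
            data.writeInt(p.y);
            data.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static SketchPoint read(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);
            int x = data.readInt(); // fields are read in the order they were written
            int y = data.readInt();
            return new SketchPoint(x, y);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```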
On 27 May 2014 21:02, Roman Leventov notifications@github.com wrote:
@RuedigerMoeller you mean adding methods like customKeySerialization(BiConsumer<Bytes, K> serializer) and for value accordingly to SharedHashMapBuilder API would be useful?
— Reply to this email directly or view it on GitHub: https://github.com/OpenHFT/HugeCollections/issues/24#issuecomment-44327551
@BoundedBuffer ReplicatedSharedHashMap.EntryExternalizable sits between memory and the wire. @RuedigerMoeller is talking about serialization between native Java objects and memory.
@peter-lawrey the problem is that VSHM does call writeObject() internally.
I was thinking of Chronicle, where it is entirely a choice. ;)
If you want to avoid looking up a marshaller for each class, what is the alternative you want to use? Can you use a mutable wrapper?
SharedHashMap<String, MyBytesMarshallableRef> map = sharedMap; // an existing shared map
MyBytesMarshallableRef ref = new MyBytesMarshallableRef();
ref.value = myRandomType;
map.put(key, ref);
if (map.getUsing(key, ref) != null) {
    // ref is set and found
}
Yes, agreed: VSHM does call writeObject(), but EntryExternalizable does not call writeObject().
Correct me if I am overlooking something. My intention is to completely replace the Bytes<=>Object transformation (e.g. avoid the per-class marshaller lookup and the instanceof chain).
@Peter
> If you want to avoid looking up a marshaller for each class, what is the alternative you want to use? Can you use a mutable wrapper?
Just provide a delegation mechanism. The lookup can be avoided because, frequently, all values have the same type, so a custom serializer could cache a marshaller. It seems ridiculous, but hash lookups always add to cache pollution. As encoding is the main performance bottleneck for off-heap storage (if one has to deal with random serializable classes), a lot of trickery can be done to speed it up (e.g. pre-known objects which are encoded by, say, a short, or partial/lazy decoding).
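A minimal sketch of the caching idea, under stated assumptions: `SketchMarshaller` and `CachingSerializer` are hypothetical names, not OpenHFT or FST types, and the `hashLookups` counter exists only to make the effect visible. The serializer remembers the marshaller for the last class it saw, so a run of same-typed values costs one hash lookup in total. This version still calls getClass() per object and is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-class marshaller.
interface SketchMarshaller {
    byte[] write(Object obj);
}

// Delegating serializer that caches the marshaller of the last class seen.
final class CachingSerializer {
    private final Map<Class<?>, SketchMarshaller> registry = new HashMap<>();
    private Class<?> lastClass;
    private SketchMarshaller lastMarshaller;
    int hashLookups; // illustration only: counts registry lookups

    void register(Class<?> type, SketchMarshaller marshaller) {
        registry.put(type, marshaller);
    }

    byte[] serialize(Object obj) {
        Class<?> type = obj.getClass();
        if (type != lastClass) {
            // Slow path: taken only when the type differs from the previous call.
            hashLookups++;
            SketchMarshaller found = registry.get(type);
            if (found == null) {
                throw new IllegalArgumentException("no marshaller registered for " + type);
            }
            lastClass = type;
            lastMarshaller = found;
        }
        // Fast path: a reference comparison instead of a hash lookup.
        return lastMarshaller.write(obj);
    }
}
```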
@BoundedBuffer - I am not too deep into the HFT classes, so I am not aware of the role of EntryExternalizable. I will have to figure that out ;)
@leventov
> you mean adding methods like customKeySerialization(BiConsumer<Bytes, K> serializer) and for value accordingly to SharedHashMapBuilder API would be useful?
Yep, something along these lines. Does this exist?
I thought about it overnight, and maybe I am better off completely wrapping the map and just putting byte arrays or Bytes from the wrapper (unfortunately, each library has its own flavour of Bytez abstraction). On the other hand, your shared map could get a significant speed boost if custom serialization were pluggable.
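The wrapping approach could be sketched roughly as follows. `ByteArrayMapWrapper` is a hypothetical name, and a plain `java.util.Map` stands in for the shared map. The wrapper does all encoding and decoding itself and hands the underlying map only byte arrays.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical wrapper: the delegate map only ever sees byte[] values,
// so its own object serialization path is bypassed for values.
final class ByteArrayMapWrapper<K, V> {
    private final Map<K, byte[]> delegate;
    private final Function<V, byte[]> encoder;
    private final Function<byte[], V> decoder;

    ByteArrayMapWrapper(Map<K, byte[]> delegate,
                        Function<V, byte[]> encoder,
                        Function<byte[], V> decoder) {
        this.delegate = delegate;
        this.encoder = encoder;
        this.decoder = decoder;
    }

    void put(K key, V value) {
        delegate.put(key, encoder.apply(value));
    }

    // Returns null when the key is absent, mirroring Map.get.
    V get(K key) {
        byte[] raw = delegate.get(key);
        return raw == null ? null : decoder.apply(raw);
    }
}
```

The trade-off is an extra byte array per operation, and whether the underlying map can store a byte array without its own overhead depends on that map's implementation.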
BTW thanks for quick feedback :-)
@RuedigerMoeller
> Does this exist?
Not yet. We will consider adding such a thing; thanks for the idea.
@RuedigerMoeller What are your time scales for this? We've added a task to our internal JIRA system to add this functionality:
HCOLL-91 SHM key/value serializer abstraction (for configuration and speed)
Awesome! If it comes within, say, 3 months, that's OK for me.
OK - We'll aim for that.