Open · jroper opened this issue 6 years ago
+1 for ByteBuffer.
Given an entity such as `Person`, and a message whose payload is JSON binary, how do we deserialize the information without defining a class such as `Person.class`? Maybe we need a method that passes a class, such as:
```java
public interface MessageSerializer {

    // Serialize the message payload to raw bytes for the wire.
    byte[] fromMessage(Message<?> message);

    // Deserialize raw bytes when the target type is implicit.
    <T> Message<T> toMessage(byte[] rawData);

    // Deserialize raw bytes into the given entity class.
    <T> Message<T> toMessage(byte[] rawData, Class<T> entityClass);
}
```
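For illustration only, a minimal sketch of how such a serializer could delegate to JSON-B; the `JsonPayloadCodec` name and shape are hypothetical, not part of any spec:

```java
import java.nio.charset.StandardCharsets;

import javax.json.bind.Jsonb;
import javax.json.bind.JsonbBuilder;

// Hypothetical helper showing why the Class<T> parameter matters: without the
// target class, a databinding framework like JSON-B cannot know which Java
// type to produce from the raw JSON bytes.
public class JsonPayloadCodec {

    private final Jsonb jsonb = JsonbBuilder.create();

    public <T> T fromRawData(byte[] rawData, Class<T> entityClass) {
        return jsonb.fromJson(new String(rawData, StandardCharsets.UTF_8), entityClass);
    }

    public byte[] toRawData(Object payload) {
        return jsonb.toJson(payload).getBytes(StandardCharsets.UTF_8);
    }
}
```

Usage would then be along the lines of `Person person = codec.fromRawData(rawData, Person.class);`.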
@Azquelt - let's use this issue for type conversion; we have pushed standardizing it out to 1.1. I would love to integrate with the MicroProfile Config type converters if possible.
+1 to what @otaviojava suggested with the `toMessage(byte[] rawData, Class<T> entityClass)` method. In order to let this integrate with databinding frameworks (e.g. JSON-B or Jackson), the framework needs to know what Java type to convert the data into. I think we need something equivalent to `MessageBodyReader` and `MessageBodyWriter` from JAX-RS.
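For context, a rough sketch of what messaging counterparts of those JAX-RS interfaces might look like; the names and signatures here are purely illustrative, not a proposal from the thread:

```java
// Illustrative only: messaging analogues of JAX-RS MessageBodyReader/Writer.
interface PayloadReader<T> {

    // True if this reader can bind raw data to the requested Java type.
    boolean isReadable(Class<?> type);

    // Bind the raw payload bytes to an instance of the requested type.
    T read(byte[] rawData, Class<T> type);
}

interface PayloadWriter<T> {

    // True if this writer can serialize the given Java type.
    boolean isWriteable(Class<?> type);

    // Serialize the payload to raw bytes.
    byte[] write(T payload);
}
```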
Migrated from https://github.com/eclipse/microprofile-reactive-streams/issues/42
Comment by @jroper: We need a serialization abstraction for messaging.
Comment by @olegz: I am assuming serialization here implies to/from a universal "wire" format such as `byte[]`. Correct? I want to make sure that there is a distinction between general type conversion (i.e., payload conversion to a type requested by a handler).

Comment by @jroper: Serialization is the conversion of a payload to/from the type requested by the handler. I don't think we need a universal wire format to do this (e.g., one messaging provider may provide payloads as bytes, another as strings, and another as a JSON-like tree structure), so we may want to make the serializer abstraction flexible enough to support whatever the underlying messaging provider can or does offer. Obviously bytes will be common, but the problem with bytes is that some messaging providers support strings, and a string can't be represented as bytes alone; it can only be represented as bytes plus a charset. So it would be better to offer direct deserialization from / serialization to strings, and to let the messaging provider handle encoding to bytes however it wants.
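A quick, self-contained illustration of the bytes-plus-charset point: the same bytes decode to different strings depending on the charset, so bytes alone do not fully describe a string payload.

```java
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    public static void main(String[] args) {
        byte[] raw = "héllo".getBytes(StandardCharsets.UTF_8);
        // Decoding with the right charset recovers the original string...
        System.out.println(new String(raw, StandardCharsets.UTF_8));      // héllo
        // ...while decoding the same bytes with another charset does not.
        System.out.println(new String(raw, StandardCharsets.ISO_8859_1)); // hÃ©llo
    }
}
```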
Comment by @olegz: James, IMHO there is a clear distinction between type serialization and type conversion. While they may look alike, semantically they are radically different. (De)serialization implies reading from or writing to some type of storage or transport format, which can only be `byte[]`. Even the systems you are referring to that deal with Strings simply have some internal mechanism for dealing with `byte[]`; they are also responsible for inferring the charset.

On the other hand, type conversion simply means transforming the payload of a message from whatever it currently is to whatever type is required by a handler operation (i.e., from Foo to Bar). Sure, the same converters can deal with converting to/from the wire format, but that is an implementation detail. All I want to communicate is that I personally draw a clear distinction between SerDe and type conversion.
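To make that distinction concrete, a minimal sketch (the interface names here are hypothetical): a serializer is tied to the wire format, while a type converter is type-to-type.

```java
// Hypothetical shapes only, to illustrate the SerDe vs type-conversion split.
interface Serializer<T> {
    byte[] serialize(T payload);     // always targets the wire format
    T deserialize(byte[] rawData);   // always reads from the wire format
}

interface TypeConverter {
    // Converts whatever the payload currently is into the type the handler
    // expects (e.g. Foo to Bar); byte[] is just one possible source or target.
    <T> T convert(Object payload, Class<T> targetType);
}
```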
Comment by @jroper: OK, so if we're talking about type conversion, what should we name the type converters? Are there examples of APIs that we can model the naming on?
Comment by @olegz: Yes, just as an example, in Spring we have the [MessageConverter](https://github.com/spring-projects/spring-framework/blob/master/spring-messaging/src/main/java/org/springframework/messaging/converter/MessageConverter.java) abstraction. Yes, we do use it for both cases (to/from `byte[]` as well as other types), but that is an implementation detail. Semantically, when I hear serialization I hear storage and/or transport, and personally I would love to see the distinction made more clear.

You can also see a similar approach in Kafka, with the exception that they have separated `Serializer` and `Deserializer`.
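For readers following along, the core shapes of the two APIs mentioned are roughly as follows (simplified from memory; the linked sources are authoritative). Spring's `MessageConverter` keeps both directions in one interface, while Kafka splits them:

```java
import java.io.Closeable;
import java.util.Map;

// Spring's MessageConverter (both directions in one interface), approximately:
//   Object fromMessage(Message<?> message, Class<?> targetClass);
//   Message<?> toMessage(Object payload, MessageHeaders headers);

// Kafka's split, approximately (renamed here to avoid clashing with the real types):
interface KafkaStyleSerializer<T> extends Closeable {
    void configure(Map<String, ?> configs, boolean isKey);
    byte[] serialize(String topic, T data);
    void close();
}

interface KafkaStyleDeserializer<T> extends Closeable {
    void configure(Map<String, ?> configs, boolean isKey);
    T deserialize(String topic, byte[] data);
    void close();
}
```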
Comment by @jroper: Thanks for the links. I like the shape of the `MessageConverter` API; I was thinking of something with the same method signatures.

Do you have any thoughts on separating the two directions? My thought is that if you are only consuming a message of a certain type, and not producing it, then there's no need to write the conversion for both directions (of course, for the most part it'll be handled automatically, in the MicroProfile case by JSON-B by default, but even for other formats, e.g. protobuf, it can be defined generically and so doesn't need users to implement their own converters; rather, they can just reference out-of-the-box converters or converters provided by third-party libraries). If we were to separate them, then the name Converter probably won't work, since it doesn't imply a direction and doesn't have an opposite. In that case, perhaps marshall/unmarshall? This is consistent with JAXB terminology. bind/unbind could also be used, but that usually implies some existing object/tree structure like JSON that you're binding your objects to/from, which in some cases might be true, but in most cases there's going to be a parse/format stage before binding/unbinding is done.

On the difference between type conversion and serialization, in your example code above it looks like message headers etc. are included in the output bytes/parsed from the input bytes, is that right? Because if that's the case, then Kafka's serializers are actually type converters that just happen to always work with `byte[]`, because they don't expect you to parse/format the headers onto the wire.

Last thing, any opinion on `byte[]` vs `ByteBuffer`? The latter has optimizations over `byte[]` and offers a small amount of extra safety in that it can be read-only.
Comment by @olegz: "doesn't need users to implement their own converters. . ." - that is pretty much the thinking. Also, with Java 8 `default` methods it would be easy to NOT force the user to implement what doesn't have to be implemented.

With regard to separating serializers vs type converters . . . let's just say that serializers are specialized type converters that always deal with to/from the `byte[]` type, so there may be some API simplifications.

And yes, Kafka would be a good analogy of type converters that always deal with to/from `byte[]`. Yes, Kafka now provides native support for headers. In previous versions of Kafka we implemented our own way of embedding headers, which users/systems can choose to implement. But the idea is that headers should be simple key/values or primitive types, thus easily embeddable and easily extracted into a Message by the serializers.

With regard to `byte[]` vs `ByteBuffer`, I am OK with `ByteBuffer`, primarily for the 'read only' reasons.
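Tying the last two points together, a purely illustrative sketch (not from the spec): default methods mean a user only implements the direction they need, and `ByteBuffer` lets the provider hand converters a read-only view of the raw data.

```java
import java.nio.ByteBuffer;

// Hypothetical converter: both directions have default implementations, so an
// application that only consumes (or only produces) implements just one method.
interface PayloadConverter<T> {

    default T toPayload(ByteBuffer rawData, Class<T> type) {
        throw new UnsupportedOperationException("inbound conversion not implemented");
    }

    default ByteBuffer fromPayload(T payload) {
        throw new UnsupportedOperationException("outbound conversion not implemented");
    }
}

class ReadOnlyExample {
    static ByteBuffer readOnlyView(byte[] raw) {
        // A read-only view prevents converters from mutating the provider's buffer.
        return ByteBuffer.wrap(raw).asReadOnlyBuffer();
    }
}
```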