Open ghost opened 1 year ago
The difference is in the polymorphic strategy. By default, all polymorphic classes are encoded the way you see in CBOR: as an array of [type, object]. However, since this is a non-standard representation for Json, Json has special support for polymorphism that is enabled by default through the Json { useArrayPolymorphism = false } flag. Other formats don't usually support this flag. To achieve what you want, you either need to add support for this special polymorphism format to CBOR, or turn off the special handling in Json by setting useArrayPolymorphism = true.
Also, note that CBOR does not really save much space, as keys are still encoded as UTF-8 strings. Perhaps you just want to use a better serializer for ByteArray in Json.
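To make the two shapes concrete, here is a minimal Python sketch of how a consumer could convert the array form [type, object] into the map form with a type key after decoding. It is not part of kotlinx.serialization; the function name flatten_polymorphic and the field names id, width and data are hypothetical. It also relies on a heuristic (any two-element list of a string followed by a map is treated as a polymorphic pair), which would misfire on ordinary lists that happen to have that shape.

```python
from typing import Any


def flatten_polymorphic(value: Any, discriminator: str = "type") -> Any:
    """Recursively turn [type_name, {fields}] pairs into {discriminator: type_name, **fields}."""
    if isinstance(value, dict):
        return {k: flatten_polymorphic(v, discriminator) for k, v in value.items()}
    if isinstance(value, list):
        # Heuristic (assumption): a two-element list of (str, dict) is a polymorphic pair.
        if len(value) == 2 and isinstance(value[0], str) and isinstance(value[1], dict):
            fields = flatten_polymorphic(value[1], discriminator)
            return {discriminator: value[0], **fields}
        return [flatten_polymorphic(v, discriminator) for v in value]
    return value


# Hypothetical decoded CBOR payload using array polymorphism for the msg property.
cbor_style = {"id": 1, "msg": ["ImageDataMessage", {"width": 2, "data": b"\x00\x01"}]}
json_style = flatten_polymorphic(cbor_style)
```

With such a step placed right after decoding, the rest of the processing logic could keep expecting a single map per message, at the cost of depending on the heuristic above.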
Thanks a lot @sandwwraith for the explanation! Maybe I'll have a look at implementing this myself during the course of Hacktoberfest 👍
The feature itself is reasonable, but I would like to warn any potential contributors here: it involves a lot of work, mostly because it requires the format to be able to read or skip through an arbitrary number of elements at various nesting levels before finding the type discriminator.
Hello @qwwdfsad,
thanks for the hint! I took a look at the code and tried to implement something, but indeed this seems to be a bigger task. I'd like to keep trying to implement this, but any pointers on where to start or what to look for would be greatly appreciated!
@jsiebert You can take a look at StreamingJsonDecoder.decodeSerializableValue
https://github.com/Kotlin/kotlinx.serialization/blob/0a1b6d856da3bc9e6a19f77ad66c1b241533a20b/formats/json/commonMain/src/kotlinx/serialization/json/internal/StreamingJsonDecoder.kt#L53
There are several takeaways:
- [...] if (deserializer is AbstractPolymorphicSerializer<*>) and then add custom behavior.
- [...] type key for polymorphism may be in arbitrary place inside this map.
- [...] type key to load serializer. If it is not there, it falls back to default behavior. This optimization is probably not necessary for CBOR.
- [...] type key in arbitrary object in arbitrary place, one probably needs intermediate data structure. For Json, this is JsonElement: first the object string is parsed to JsonElement, then the type key is extracted, then the rest of the JsonElement is parsed to an actual Kotlin object using a separate JsonTreeDecoder (the decoder is separate because its input is JsonElement, not String).
- [...] JsonElement and JsonTreeDecoder in Cbor (yet).
- [...] type key, then to deserialize to Kotlin object with actual deserializer. It will probably be much simpler, but this is for you to find out.
Hope this helps. Good luck!
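The two-pass idea described in these takeaways (parse into an intermediate structure, extract the type key wherever it sits, then decode the rest with the concrete deserializer) can be sketched in a language-neutral way. The Python below is only an illustration of the technique, not the kotlinx.serialization implementation; REGISTRY, decode_polymorphic, TextMessage and the message fields are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class ImageDataMessage:
    width: int
    data: bytes


@dataclass
class TextMessage:
    text: str


# Hypothetical registry mapping discriminator values to concrete types.
REGISTRY: Dict[str, Callable[..., Any]] = {
    "ImageDataMessage": ImageDataMessage,
    "TextMessage": TextMessage,
}


def decode_polymorphic(tree: Dict[str, Any], discriminator: str = "type") -> Any:
    # Pass 1 has already happened: `tree` is the fully parsed intermediate
    # structure, so the discriminator may sit at any position among the keys.
    if discriminator not in tree:
        raise ValueError(f"missing {discriminator!r} key")
    type_name = tree[discriminator]
    if type_name not in REGISTRY:
        raise ValueError(f"unknown type {type_name!r}")
    # Pass 2: hand the remaining fields to the concrete type.
    fields = {k: v for k, v in tree.items() if k != discriminator}
    return REGISTRY[type_name](**fields)


# The type key is deliberately placed in the middle of the map.
message = decode_polymorphic({"width": 2, "type": "ImageDataMessage", "data": b"\x01"})
```

The trade-off is the one the thread points out: the whole object has to be materialized before the concrete type is known, which is why an intermediate tree (JsonElement in the Json case) is needed.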
This is more a question than a feature request... I'm working in an environment where JSON messages are published from an Android app, written in Kotlin, via MQTT, to a Python-based backend where these messages are decoded and processed. Serialization in the Android app is done with kotlinx.serialization, of course... :wink:
The messages are being serialized from a wrapper class which is implemented as follows:
The 'real' message content is stored in the msg property of this wrapper and derived from the ExternalMessageBase class:
Now I do have one specific message type, which contains image data, that I don't want to encode in JSON but in CBOR, to keep the message size minimal and to avoid encoding the image data to a base64 string in the app. The class for this message is implemented as follows:
The encoding of this message in JSON results in a slightly different structure than encoding in CBOR (this output has been generated using json.loads/cbor2.loads on the Python side):
(JSON)
(CBOR)
As one can see, in the JSON output, the ImageDataMessage is encoded into one map which also contains the type attribute, whereas in CBOR the ImageDataMessage is encoded into a list which contains the type attribute and a map which contains the remainder of the ImageDataMessage object.
What I would like to achieve is that the result of the serialization for CBOR is the same as for JSON, because that would avoid a large number of changes to the processing logic in the Python-based backend. Ideally, I would just replace
with
in my Python code and the processing logic works the same regardless of the format the message was encoded to.
Is this achievable somehow using kotlinx.serialization, e.g. by changing the CBOR configuration or the implementation of the message classes?
Thanks in advance!