cph-cachet / carp.core-kotlin

Infrastructure-agnostic framework for distributed data collection.
https://carp.cachet.dk/core/
MIT License
21 stars 3 forks source link

Massive problems with Kotlin JSON serialization #233

Closed bardram closed 2 years ago

bardram commented 3 years ago

I'm experiencing massive problems with the JSON coming out of the Kotlin core classes. Here are just some example.

The JSON for a MasterDeviceDeployment looks like this:

{
    "deviceDescriptor": {
        "$type": "dk.cachet.carp.protocols.domain.devices.Smartphone",
        "isMasterDevice": true,
        "roleName": "phone",
        "samplingConfiguration": {},
        "supportedDataTypes": [
            "dk.cachet.carp.geolocation",
            "dk.cachet.carp.stepcount"
        ]
    },
    "configuration": {
        "$type": "dk.cachet.carp.protocols.domain.devices.DefaultDeviceRegistration",
        "registrationCreationDate": "2021-01-21T13:50:39.391Z",
        "deviceId": "114faa76-f00d-4665-b71b-d9b9508786d3"
    },
    "connectedDevices": [],
    "connectedDeviceConfigurations": {},
    "tasks": [
        {
            "$type": "dk.cachet.carp.protocols.domain.tasks.ConcurrentTask",
            "name": "task",
            "measures": [
                {
                    "$type": "dk.cachet.carp.protocols.domain.tasks.measures.PhoneSensorMeasure",
                    "type": "dk.cachet.carp.geolocation",
                    "duration": -1000
                }
            ]
        }
    ],
    "triggers": {
        "0": {
            "$type": "dk.cachet.carp.protocols.domain.triggers.ElapsedTimeTrigger",
            "sourceDeviceRoleName": "phone",
            "elapsedTime": 10000
        }
    },
    "triggeredTasks": [
        {
            "triggerId": 0,
            "taskName": "task",
            "destinationDeviceRoleName": "phone"
        }
    ],
    "lastUpdateDate": "2021-01-21T13:50:39.391Z"
}

This has several problems:

And this is ONLY the problems with one class......

Whathecode commented 3 years ago

Serialization is an infrastructure feature. All of the "problems" you observe are there by design as the default way CARP core supports serialization. This is not a bug: serialization and deserialization is tested fully for all serializable classes on both the JS and JVM runtime.

You are not required to use these, but they remove some of the boilerplate code you would otherwise have to write.

The boilerplate code referred to here is writing a data transfer object (DTO) for each of the classes which are used by the application services in core, and setting up a serialization library to make them serializable so they may be sent across the wire. This work right now is lifted from our webservices implementation by core. But, it does not prevent our webservices implementation from modifying certain aspects of serialization if required for our specific infrastructure.

Unfortunately, there is no "standard" way to serialize things. Different libraries may output things differently, and even a single library can be configured to serialize things differently (that's a good thing, how else can you make separate libraries in separate languages talk to each other). Polymorphic serialization (the $type field) is a prime example of a lack of standards in JSON serialization.

I will thus interpret this bug report as a question on how certain things are serialized in CARP core, and, potential feature requests to change the default way things are serialized. The comments below are common to all serialization in CARP core; not specific to just this one class.

However, this does hint at a lack of documentation/specification for the default serialization format outside of the Kotlin codebase. I created a new issue for this: Create schema/specification for serializable objects

Polymorphic serialization

There is inconsistency in using the $type -- some uses it, some don't and it's impossible to know when.

$type is added as a field by kotlinx.serialization when the type which needs to be serialized is polymorphic.

Concretely, configuration in the example JSON you gave is of type DeviceRegistration, which is an abstract class. The type information is added so that the deserializer knows which concrete type to load. Conversely, TriggeredTask is a concrete class.

In short, only types which expect one of many potential instances contains the $type field. In theory it should be possible to add a $type field to all JSON objects (note that you can't do this for primitives, only array polymorphism would support that), but it is not necessary for polymorphism to work; at least not when using kotlinx.serialization. I understand it may be a problem configuring your serializer in Dart as such, which is why I filed this feature request to kotlinx.serialization before. I can try to resurface the issue there.

Serialization of primitives

The DataType is serialized to a String, and not to the definition of a namespace and a name

Even though types are classes in source code, does not always warrant them of being serialized as JSON objects. DataType is a nice example of that. DataType is nothing but an identifier, as we decided early on in the project.

... DataType [has] the sole purpose of uniquely identifying the data (how it should be uploaded, what data can be measured). This also answers the question 'how should these be serialized', as all that remains is the identifier.

You can see DataType is simply a typealias of NamespacedId, for which a custom 'primitive' serializer is defined which simply serializes it as a String.

UUID is similar. We don't want to serialize UUID as:

{
  "$type": "dk.cachet.carp.common.UUID",
  "stringRepresentation": "c19e5d56-6add-404c-9bbe-f2ae77737e30"
}

TriggeredTask

TriggeredTask does not in any way serialize as defined... where does the triggerId, taskName, etc. come from?

You are looking at the wrong source code. The TriggeredTask in MasterDeviceDeployment is the DTO which is serialized. It is a normalized form of the domain object representation (with full references) you are linking to.

Serializing collections

The Int in the triggers map serialize as a String (e.g., "0") and not as an integer.

Maps in Kotlin are serialized as JSON objects. triggers is a Map. The keys in the map are mapped to JSON keys and the values to JSON values.

JSON keys have to be strings in order to comply with JSON standards.

Whathecode commented 2 years ago

However, this does hint at a lack of documentation/specification for the default serialization format outside of the Kotlin codebase. I created a new issue for this: Create schema/specification for serializable objects

I presume that with the addition of these JSON schemas, this will become much clearer and this issue can be closed.