Support Parameterized Serialization/Deserialization

jeromesimeon commented 4 years ago

Is your feature request related to a problem? Please describe.

Provide the ability to handle variations in the serialized JSON for Concerto. One instance is a custom serialisation aligned with the Ergo runtime (see also https://github.com/accordproject/ergo/issues/57 and https://github.com/accordproject/ergo/issues/191).

Some of those variations may have better properties for handling CTO-validated values in certain scenarios. An example of a useful property is for the JSON representation to be self-describing (i.e., it can be interpreted without having to inspect the schema). This property does not currently hold, as there is no way to, e.g., distinguish between { number: 1 } matching o Integer number and { number: 1 } matching o Double number.
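To make the ambiguity concrete, here is a minimal sketch in plain JavaScript (no Concerto APIs are involved, and the variable names are illustrative only). Both spellings of the number parse to the same value, so the Integer/Double distinction is already lost once the JSON has been parsed.

```javascript
// Two payloads that a CTO model could distinguish as Integer vs Double.
const asInteger = JSON.parse('{ "number": 1 }');
const asDouble = JSON.parse('{ "number": 1.0 }');

// After parsing, both fields hold the same JavaScript number.
const sameValue = asInteger.number === asDouble.number;
const sameType = typeof asInteger.number === typeof asDouble.number;

// Serializing back does not restore the ".0" either.
const roundTripped = JSON.stringify(asDouble);

console.log(sameValue, sameType, roundTripped);
```

Nothing in the parsed value records which of the two declarations it was validated against, which is why the type has to come either from the schema or from the data itself.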

Describe the solution you'd like

A generic framework to parameterise serialisation methods Serializer.toJSON and Serializer.fromJSON
Remove current Ergo-specific options in the serializer and instead rely on that generic framework. E.g., https://github.com/accordproject/concerto/blob/2c0b62a8638da0f2c8876b176d0f8e7f8f2d9e97/packages/concerto-core/lib/serializer/jsonpopulator.js#L226

dselman commented 4 years ago

How hard would it be to make the Ergo JSON deserialisation model-aware -- looking up the field type in the model when we hit anything ambiguous (double vs integer etc)?

With the move to removing the JSON serialiser etc I think it is even more important that we stay as close to canonical JSON as possible.

jeromesimeon commented 4 years ago

> How hard would it be to make the Ergo JSON deserialisation model-aware -- looking up the field type in the model when we hit anything ambiguous (double vs integer etc)?

Pretty hard. Not only does it mean you have to navigate through the model at runtime, but you also have to handle "naked" values without a model around them (e.g., 1 + 2 or 1.2 + 2.3, which correspond to instructions on Integer and instructions on Double respectively).

Currently the JSON representation used at runtime for logic execution is self-describing, which is a really nice property to have.

jeromesimeon commented 4 years ago

Also: backends other than JavaScript will not have the luxury of doing that, so we don't want to make that an assumption of the compiler.

jeromesimeon commented 4 years ago

A typical non-trivial example: How do you handle the distinction between toString(1) and toString(1.0) when all you have in the runtime are JavaScript numbers?

Or: match x with 1 return "foo" with 1.0 return "bar" else throw...
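A small JavaScript sketch of why this is a problem, and of how a tagged value sidesteps it. The `{ type, value }` shape, the helper names, and the choice to render a whole-number Double as "1.0" are all assumptions made for illustration; they are not the actual Ergo runtime encoding.

```javascript
// Hypothetical tagged representation: the tag names are illustrative only.
const integer = (value) => ({ type: 'Integer', value });
const double = (value) => ({ type: 'Double', value });

// With bare JavaScript numbers the two cases are indistinguishable.
const bareIsAmbiguous = String(1) === String(1.0);

// With tags, toString can choose a rendering per type.
function toStringTagged(v) {
  if (v.type === 'Integer') return String(v.value);
  if (v.type === 'Double') {
    // Assumption: whole-number doubles keep one decimal place.
    return Number.isInteger(v.value) ? v.value.toFixed(1) : String(v.value);
  }
  throw new Error(`Unsupported type: ${v.type}`);
}

// The match example above, dispatching on the type as well as the value.
function matchExample(x) {
  if (x.type === 'Integer' && x.value === 1) return 'foo';
  if (x.type === 'Double' && x.value === 1.0) return 'bar';
  throw new Error('no match');
}

console.log(toStringTagged(integer(1)), toStringTagged(double(1)));
console.log(matchExample(integer(1)), matchExample(double(1)));
```

The same dispatch would be impossible on bare numbers, since `1 === 1.0` in JavaScript.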

dselman commented 4 years ago

Thanks, that helps. Could we somehow make those transformations more generic?

E.g.

{
   "$class": "org.acme.Foo",
   "myInteger": 10
}

Can be transformed to:

{
   "$class": "org.acme.Foo",
   "myInteger": {
      "$class": "concerto.Integer",
      "value": 10
   }
}

And then transformed back ...?
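The round trip suggested above could look roughly like the following sketch. The `annotate` and `strip` helpers and the `fooFieldTypes` table are hypothetical stand-ins; a real version would take its type information from the Concerto model rather than from a hand-written map.

```javascript
// Hypothetical stand-in for model information: field name -> primitive type.
const fooFieldTypes = new Map([['myInteger', 'Integer']]);

// Wrap each typed field in a { $class, value } object. The input is not mutated.
function annotate(instance, fieldTypes) {
  const result = {};
  for (const [key, value] of Object.entries(instance)) {
    const type = fieldTypes.get(key);
    result[key] = type ? { $class: `concerto.${type}`, value } : value;
  }
  return result;
}

// Inverse: unwrap any field whose value is a concerto.* wrapper.
function strip(annotated) {
  const result = {};
  for (const [key, value] of Object.entries(annotated)) {
    const isWrapper = value !== null && typeof value === 'object'
      && typeof value.$class === 'string'
      && value.$class.startsWith('concerto.');
    result[key] = isWrapper ? value.value : value;
  }
  return result;
}

const user = { $class: 'org.acme.Foo', myInteger: 10 };
const typed = annotate(user, fooFieldTypes);
const back = strip(typed);

console.log(JSON.stringify(typed));
console.log(JSON.stringify(back));
```

This sketch only handles flat objects; nested classes and arrays would need a recursive traversal.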

jeromesimeon commented 4 years ago

> Thanks, that helps. Could we somehow make those transformations more generic?
>
> E.g.
> {
>    "$class": "org.acme.Foo",
>    "myInteger": 10
> }
> Can be transformed to:
> {
>    "$class": "org.acme.Foo",
>    "myInteger": {
>       "$class": "concerto.Integer",
>       "value": 10
>    }
> }
> And then transformed back ...?

That's essentially what the Ergo-specific serialization/deserialization does.

Do you want users to write that kind of JSON? -- if not, you need a distinction between user and internal representation.

This becomes even more acute for backend X different from JavaScript -- but also makes the question/distinction clearer: you will need to validate user JSON into: Java object / Haskell values / WASM memory blocks, you name it.

dselman commented 4 years ago

Makes sense.

So this to me feels like it is something that is (largely?) outside the scope of Concerto. Perhaps Concerto could assist in providing a generic transformation like the above: adding type information for any JSON values that are ambiguous: numbers, enums, DateTime? That JSON can then be handled by the respective Ergo backends, based on the execution environment.

User JSON <-> Validate & Transform <-> Concerto "Strongly Typed JSON" <-> Ergo Runtime

jeromesimeon commented 4 years ago

> Makes sense.
>
> So this to me feels like it is something that is (largely?) outside the scope of Concerto. Perhaps Concerto could assist in providing a generic transformation like the above: adding type information for any JSON values that are ambiguous: numbers, enums, DateTime? That JSON can then be handled by the respective Ergo backends, based on the execution environment.
> User JSON <-> Validate <-> Strongly Typed JSON <-> Ergo Runtime

That's exactly the pipeline, but really the type annotations are Concerto types, and if you were doing it outside of Concerto you would essentially replicate the serialisation code (that's what happened for a few releases) -- which creates maintainability costs/issues.

Now I actually think this could be handled really nicely with the right generic transformation (exactly as you suggest), and yes, separating the validation from the transform would make that super clear, I think. It's one additional reason I am excited about this functional refactoring.

jeromesimeon commented 4 years ago

I had not thought about this before, but the transform we are talking about isn't all that different from those we do in concerto-tools so we could move the Ergo-specific part there. @dselman any thoughts on that idea?

dselman commented 4 years ago

It's still a little fuzzy to me, but I think it makes sense.

I imagine a JSON transformation that copies the JSON, but maintains a stack to store the path to the current node, e.g. foo.bar.baz. When it hits an ambiguous type (string or number) it looks for type information in the ModelManager for that path, and rather than just copying the source JSON, it copies some "annotated" JSON that conveys the type information.

Yes, we could put that code in concerto-tools - as it probably isn't required for most JS users.
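The path-stack transformation described above might be sketched as follows. The `typesByPath` table and the `lookupType` callback are hypothetical placeholders for a model lookup; no real ModelManager method is called here, and the `{ $class, value }` wrapper reuses the shape proposed earlier in this thread.

```javascript
// Hypothetical type lookup keyed by dotted path. A real version would
// consult the model instead of a fixed table.
const typesByPath = new Map([
  ['foo.bar.baz', 'Integer'],
  ['foo.ratio', 'Double'],
]);

// Copy the JSON, pushing and popping property names on a stack while descending.
function annotateByPath(node, lookupType, stack = []) {
  if (Array.isArray(node)) {
    // In this sketch, array elements share the path of the array itself.
    return node.map((item) => annotateByPath(item, lookupType, stack));
  }
  if (node !== null && typeof node === 'object') {
    const copy = {};
    for (const [key, value] of Object.entries(node)) {
      stack.push(key);
      copy[key] = annotateByPath(value, lookupType, stack);
      stack.pop();
    }
    return copy;
  }
  if (typeof node === 'number' || typeof node === 'string') {
    // Ambiguous leaf: emit annotated JSON if the path has a known type.
    const type = lookupType(stack.join('.'));
    return type ? { $class: `concerto.${type}`, value: node } : node;
  }
  return node; // booleans and null are copied as-is
}

const source = { foo: { bar: { baz: 1 }, ratio: 1, label: 'x' } };
const annotated = annotateByPath(source, (path) => typesByPath.get(path));

console.log(JSON.stringify(annotated, null, 2));
```

Because the function builds a fresh copy at every level, the source JSON is left untouched, which also addresses the concern about side effects.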

jeromesimeon commented 4 years ago

> It's still a little fuzzy to me, but I think it makes sense.
>
> I imagine a JSON transformation that copies the JSON, but maintains a stack to store the path to the current node, e.g. foo.bar.baz. When it hits an ambiguous type (string or number) it looks for type information in the ModelManager for that path, and rather than just copying the source JSON, it copies some "annotated" JSON that conveys the type information.
>
> Yes, we could put that code in concerto-tools - as it probably isn't required for most JS users.

Yes, the way that transformation code would work is a little fuzzy to me as well. I wouldn't mind a functional approach to that too, but it's a marginal concern (mostly to avoid being confused about side effects on the source JSON).

One thought might be to have that transform work as a kind of tree transducer / reducer (where the output type could be anything -- including JSON).

The specific description for the transform might be a set of functions:

{
  string: ((ctxt, s) => { /* ... */ }), // where s is the input string
  double: ((ctxt, d) => { /* ... */ }), // where d is the input double
  array: ((ctxt, l) => { /* ... */ }), // where l is the result of applying the transform to all the elements in the array
  class: ((ctxt, className, fields) => { /* ... */ }) // where fields is the result of applying the transform to all the fields in the class object
}

// ctxt can contain type information -- e.g., model manager, class declaration or path from root to local data object
// the output type for those functions could be anything: new JSON, a string, a number (e.g., count all the nodes in the input), etc.

Sorry, this is not a very well thought-out idea -- more of a general intuition.
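A minimal sketch of how such a handler set could drive a generic fold over a JSON value. The `fold` function, the shape of `ctxt` (here just a path array), and both handler sets are assumptions made for illustration; they are not part of Concerto.

```javascript
// Generic fold over a JSON value, driven by a set of handler functions.
// Booleans and null are not covered by the handler set and are rejected.
function fold(handlers, value, ctxt = { path: [] }) {
  if (typeof value === 'string') return handlers.string(ctxt, value);
  if (typeof value === 'number') return handlers.double(ctxt, value);
  if (Array.isArray(value)) {
    return handlers.array(ctxt, value.map((v) => fold(handlers, v, ctxt)));
  }
  if (value !== null && typeof value === 'object') {
    const fields = {};
    for (const [key, v] of Object.entries(value)) {
      if (key === '$class') continue; // passed separately as className
      fields[key] = fold(handlers, v, { path: [...ctxt.path, key] });
    }
    return handlers.class(ctxt, value.$class, fields);
  }
  throw new Error(`Unsupported value at ${ctxt.path.join('.')}`);
}

// Handler set 1: the output is a number (count of nodes in the input).
const countNodes = {
  string: () => 1,
  double: () => 1,
  array: (ctxt, l) => 1 + l.reduce((a, b) => a + b, 0),
  class: (ctxt, className, fields) =>
    1 + Object.values(fields).reduce((a, b) => a + b, 0),
};

// Handler set 2: the output is new JSON (an identity transform).
const identity = {
  string: (ctxt, s) => s,
  double: (ctxt, d) => d,
  array: (ctxt, l) => l,
  class: (ctxt, className, fields) => ({ $class: className, ...fields }),
};

const input = { $class: 'org.acme.Foo', name: 'a', values: [1, 2.5] };
const count = fold(countNodes, input); // 5: object, string, array, two numbers
const copy = fold(identity, input);

console.log(count, JSON.stringify(copy));
```

An Ergo-specific serialisation would then be one more handler set, for example a `double` handler that returns a type-annotated wrapper instead of the bare number.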

accordproject / concerto

Support Parameterized Serialization/Deserialization #162