crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0

Serialization of constant fields, and extensible discriminators #11894

Open HertzDevil opened 2 years ago

HertzDevil commented 2 years ago

JSON::Serializable.use_json_discriminator has a few problems:

To address these problems, we first consider the case when there is exactly one concrete type; the discriminator then reduces to a single constant field. This field shall be deserialized and serialized even when a corresponding instance variable does not exist. This is still useful in situations where a JSON format might define a magic constant and require all documents to include that constant. A sensible place to declare the constant field is via the JSON::Serializable::Options annotation:

@[JSON::Serializable::Options(constants: {type: "point"})]
class Point
  include JSON::Serializable
  property x : Int32
  property y : Int32
end

x = Point.from_json %({"type":"point","x":1,"y":2})
x         # => #<Point:0x7f8d332b6e90 @x=1, @y=2>
x.to_json # => {"type":"point","x":1,"y":2}

Point.from_json %({"type":"poin","x":1,"y":2}) # raises JSON::SerializableError
Point.from_json %({"type":false,"x":1,"y":2})  # raises JSON::SerializableError
Point.from_json %({"x":1,"y":2})               # raises JSON::SerializableError

constants must be a HashLiteral or a NamedTupleLiteral, mapping field names to their expected constant values. This implies every type may use multiple discriminator fields, should the need arise. These literals are chosen because they look like JSON objects.
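Until such an option exists, the behaviour can be approximated with the current API. The following is only a sketch, not the proposed implementation: the `kind` instance variable, the `EXPECTED_KIND` constant and the check in `after_initialize` are names and choices made up for illustration. Unlike the proposal, it stores the constant in every instance.

```crystal
require "json"

class Point
  include JSON::Serializable

  # Illustrative name; not part of the proposal.
  EXPECTED_KIND = "point"

  # Maps the JSON field "type" to an ordinary instance variable.
  # There is no default value, so a missing "type" key fails to deserialize.
  @[JSON::Field(key: "type")]
  getter kind : String

  property x : Int32
  property y : Int32

  # Hook that JSON::Serializable runs after an instance is deserialized.
  def after_initialize
    unless @kind == EXPECTED_KIND
      # The source location is not known here, so 0, 0 are placeholders.
      raise JSON::ParseException.new(
        "Expected type #{EXPECTED_KIND.inspect}, got #{@kind.inspect}", 0, 0)
    end
  end
end

point = Point.from_json %({"type":"point","x":1,"y":2})
point.to_json # => {"type":"point","x":1,"y":2}
```

Raising `JSON::ParseException` rather than some other error keeps this compatible with the standard library's union deserialization, which only rescues that exception when it moves on to the next member type.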

With this, disjoint unions now work out of the box, no extra converters required:

@[JSON::Serializable::Options(constants: {type: "circle"})]
class Circle
  include JSON::Serializable
  property x : Int32
  property y : Int32
  property radius : Int32
end

alias Shape = Point | Circle

shapes = Array(Shape).from_json <<-JSON
[
  {
    "type": "point",
    "x": 2,
    "y": 8
  },
  {
    "type": "circle",
    "x": 4,
    "y": 7,
    "radius": 5
  }
]
JSON
shapes         # => [#<Point:0x7eff847a8e20 @x=2, @y=8>, #<Circle:0x7eff847abe20 @x=4, @y=7, @radius=5>]
shapes.to_json # => [{"type":"point","x":2,"y":8},{"type":"circle","x":4,"y":7,"radius":5}]

Next, we extend this behaviour to all abstract classes as well: the result of T.from_json(json) is simply that of U.from_json(json) for the first concrete, possibly indirect, subclass U of T for which the call returns successfully. This is probably the only sensible interpretation for T.from_json with an abstract T. This could be done in e.g. new_from_json_pull_parser, which currently fails to compile for an abstract T because T.allocate does not exist; the only way to get the new behaviour is to remove any uses of use_json_discriminator in the abstract superclass T. Then we arrive at the following reimplementation of use_json_discriminator's example code:

abstract class Shape
  include JSON::Serializable
end

@[JSON::Serializable::Options(constants: {type: "point"})]
class Point < Shape; ...; end

@[JSON::Serializable::Options(constants: {type: "circle"})]
class Circle < Shape; ...; end

Shape.from_json %({"type":"point","x":2,"y":8})             # => #<Point:0x7fee792b7e40 @x=2, @y=8>
Shape.from_json %({"type":"circle","x":4,"y":7,"radius":5}) # => #<Circle:0x7fee792bae20 @x=4, @y=7, @radius=5>
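The "first subclass that parses successfully" rule can be sketched with the current macro language. This is a hedged illustration only: `from_json_first_match` is a hypothetical helper name (the proposal would build the logic into `Shape.from_json` itself), and because the `constants` option does not exist yet, the subclasses here are told apart by including `JSON::Serializable::Strict` instead of a discriminator field.

```crystal
require "json"

abstract class Shape
  include JSON::Serializable

  # Hypothetical helper: tries every concrete subclass in turn.
  def self.from_json_first_match(json : String) : Shape
    {% for subclass in @type.all_subclasses %}
      {% unless subclass.abstract? %}
        begin
          return {{subclass}}.from_json(json)
        rescue JSON::ParseException
          # This subclass rejected the document; try the next one.
        end
      {% end %}
    {% end %}
    raise ArgumentError.new("No concrete subclass of Shape accepts #{json}")
  end
end

class Point < Shape
  include JSON::Serializable::Strict # unknown keys such as "radius" raise
  property x : Int32
  property y : Int32
end

class Circle < Shape
  include JSON::Serializable::Strict
  property x : Int32
  property y : Int32
  property radius : Int32
end

Shape.from_json_first_match(%({"x":2,"y":8})).class            # => Point
Shape.from_json_first_match(%({"x":4,"y":7,"radius":5})).class # => Circle
```

The macro loop sits inside a method body, so it is expanded when the method is first instantiated, by which point all subclasses have been declared.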

Likewise, to support modules we check on all includers of T, possibly indirect ones, that are classes. Unfortunately, this isn't as straightforward because only TypeNode#all_subclasses exists, but not #all_includers. JSON::Serializable's included hook would also have to be slightly adjusted to support module includers.

In the above snippets, Shape never needs to be aware that its subtypes use a discriminator field; the types that need discriminators define them right at their own declarations. Indeed, some hierarchies do not require a discriminator at all, because the union interpretation is sufficient. The following shall be supported too:

abstract class Foo
  include JSON::Serializable
end

class Bar1 < Foo
  property x : Int32
end

class Bar2 < Foo
  property x : String
end

Array(Foo).from_json %([{"x":1},{"x":""}]) # => [#<Bar1:0x7f668d3f3e50 @x=1>, #<Bar2:0x7f668d3f2d60 @x="">]
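For comparison, a union of unrelated serializable classes can already be deserialized this way with the current standard library, which tries each member type until one parses. A small sketch with made-up class names (`IntBox` and `StringBox` share no superclass, so the union is not collapsed into a common parent type):

```crystal
require "json"

class IntBox
  include JSON::Serializable
  property x : Int32
end

class StringBox
  include JSON::Serializable
  property x : String
end

boxes = Array(IntBox | StringBox).from_json %([{"x":1},{"x":""}])
boxes[0].class # => IntBox
boxes[1].class # => StringBox
```

The field types here are disjoint, so the order in which the member types are tried does not affect the result.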

Going in the opposite direction, this treatment now supports discriminators spread over multiple field names, which are sometimes seen in the wild. Here we assume the discriminators are mutually exclusive; alternatively, we could include JSON::Serializable::Strict:

@[JSON::Serializable::Options(constants: {is_point: true})]
class Point < Shape; ...; end

@[JSON::Serializable::Options(constants: {is_circle: true})]
class Circle < Shape; ...; end

shape = Shape.from_json %({"is_point":true,"x":2,"y":8})
shape         # => #<Point:0x7f8a354a8e40 @x=2, @y=8>
shape.to_json # => {"is_point":true,"x":2,"y":8}

shape = Shape.from_json %({"is_circle":true,"x":4,"y":7,"radius":5})
shape         # => #<Circle:0x7f8a354abec0 @x=4, @y=7, @radius=5>
shape.to_json # => {"is_circle":true,"x":4,"y":7,"radius":5}

A proof-of-concept implementation is available here. Float and null constants are supported on top of what we have now, but Crystal constants are not ready, nor is YAML serialization. Also it might be worth pre-fetching the constant fields in new_from_json_pull_parser / new_from_yaml_node to speed up the dispatch to subtypes, instead of reading the same constants over and over again in every subtype. Note that the compiler and the standard library themselves don't use discriminators.
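To make the pre-fetching idea concrete, here is a rough sketch using only today's API. It is not taken from the proof-of-concept: `from_json_prefetched` is a hypothetical helper, the mapping is written out by hand, and the document is parsed twice (once into `JSON::Any` to read the constant, once into the chosen subtype), which a real implementation would presumably avoid.

```crystal
require "json"

abstract class Shape
  include JSON::Serializable

  # Hypothetical helper: reads the "type" field once and dispatches on it,
  # instead of letting every subtype re-read the field and fail in turn.
  def self.from_json_prefetched(json : String) : Shape
    kind = JSON.parse(json)["type"]?.try(&.as_s?)
    case kind
    when "point"  then Point.from_json(json)
    when "circle" then Circle.from_json(json)
    else
      raise ArgumentError.new("Unknown or missing type: #{kind.inspect}")
    end
  end
end

class Point < Shape
  property x : Int32
  property y : Int32
end

class Circle < Shape
  property x : Int32
  property y : Int32
  property radius : Int32
end

Shape.from_json_prefetched(%({"type":"circle","x":4,"y":7,"radius":5})).class # => Circle
```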

Related: #8473

cyangle commented 2 years ago

Is it possible to annotate the constants directly?

class Point
  include JSON::Serializable
  @[JSON::Constant(key: "type")]
  TYPE = "point"
  property x : Int32
  property y : Int32
end

HertzDevil commented 2 years ago

There is no way to access annotations on constants, as their TypeNodes are inaccessible in the macro language.

cyangle commented 2 years ago

Anyway, this is a brilliant idea. I hope Crystal gets it soon.

kostya commented 1 year ago

I don't think this is good: in one project I have hundreds of classes and generate the use_json_discriminator call automatically in a macro, but with this proposal I would need to type annotations like these by hand every time.

@[JSON::Serializable::Options(constants: {type: "point"})]
class Point < Shape; ...; end

@[JSON::Serializable::Options(constants: {type: "circle"})]
class Circle < Shape; ...; end

HertzDevil commented 1 year ago

You can attach annotations by reopening the classes too. In particular you can reopen the current class with something like:

class Foo
  {% begin %}
    @[JSON::Serializable::Options(...)]
    class ::{{ @type }}
    end
  {% end %}
end

So if you can use a macro to generate the use_*_discriminator call, you can use a macro to generate those Options annotations.

(In fact I wonder if this issue can be implemented on top of use_*_discriminator this way)
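As a rough check of the reopening trick, the sketch below attaches a generated annotation and reads it back. Since the `constants` option does not exist yet, `Tag` is a stand-in annotation invented for this example, and `tag_name` is a hypothetical helper that only shows the annotation ended up on the class.

```crystal
# Stand-in annotation; the proposal would use JSON::Serializable::Options instead.
annotation Tag
end

class Foo
  {% begin %}
    # The annotation argument is derived from the class name ("Foo" becomes "foo").
    @[Tag(name: {{ @type.name.stringify.underscore }})]
    class ::{{ @type }}
    end
  {% end %}

  # Reads the generated annotation back at compile time.
  def self.tag_name : String
    {{ @type.annotation(Tag)[:name] }}
  end
end

Foo.tag_name # => "foo"
```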

kostya commented 1 year ago

I do it like this:

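abstract class Bla
  include JSON::Serializable

  # The finished hook runs once parsing is done, so all subclasses are known.
  # Each subclass is keyed by its underscored, unqualified name.
  macro finished
    use_json_discriminator("type", {
      {% for subclass in @type.all_subclasses %}
        {{subclass.name.split("::").last.underscore}} => {{subclass}},
      {% end %}
    })
  end
end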