owlike / genson

Genson a fast & modular Java <> Json library
http://owlike.github.io/genson/
219 stars 65 forks source link

Serialization of nested polymorphic types is broken #132

Open aseovic opened 6 years ago

aseovic commented 6 years ago

I ran into a bug yesterday that had me scratching my head for quite a while, where only attributes of the statically declared polymorphic super type were being serialized, but the attributes of the runtime type were not.

Basically, for these classes:

  private static class Pet {
    public String name;
  }

  private static class Dog extends Pet{
    public String breed;
  }

  private static class Person {
    public Pet pet;
  }

the following test

  @Test
  public void testSerializeNestedPolymorphicType() {
    String expected = "{\"pet\":{\"breed\":\"Boxer\",\"name\":\"Fido\"}}";

    Dog dog = new Dog();
    dog.name = "Fido";
    dog.breed = "Boxer";

    Person p = new Person();
    p.pet = dog;

    assertEquals(expected, genson.serialize(p));
  }

fails with

org.junit.ComparisonFailure: 
Expected :{"pet":{"breed":"Boxer","name":"Fido"}}
Actual   :{"pet":{"name":"Fido"}}

Now, the "fix" is as simple as specifying builder.useRuntimeType(true) when constructing genson instance, but I'd argue that if that is the solution then the use of runtime types shouldn't be optional -- the serialization above should work regardless of how Genson is configured. As it currently stands, it doesn't.

EugenCepoi commented 6 years ago

I don't understand what the bug is. By default Genson is using the static type (the one that is available at compile time). Using the runtime type instead can be enabled via useRuntimeType(true) as you point above, in which case the static type will be ignored.

I don't get how the current behavior is not working.

aseovic commented 6 years ago

I guess my point is that while static types are useful for deserialization (if you have them, which in many cases you don't), the runtime type should always be used for serialization. I really don't see the point of having it as a configuration option, and especially of having it off by default.

The example above is about as simple as it gets, and I would expect it work out of the box, without the need for any special configuration. It's the "principle of least surprise" thing, I guess.

What is the reason you think runtime type usage should be optional during serialization, and off by default? What am I missing?

EugenCepoi commented 6 years ago

It is off by default because I considered that if we can serialize something, we want to be able to read it back in the same shape. JSON it self has no notion of polymorphism, so it seemed weird to allow by default serializing something that we won't be able to read back.

To fully support polymorphic ser/de class metadata should probably be enabled as it's the only way to read back what we wrote using runtime types.

EugenCepoi commented 6 years ago

Though I'm not strongly opposed to change the default behavior to use runtime type for serialization.

aseovic commented 6 years ago

I hear you, but just because we can round trip something without throwing an exception, doesn't mean it is correct. The above obviously isn't, because I lost both breed information and the fact that my pet is a dog along the way. If I deserialized JSON back into Person p2, assertEquals(p, p2) would fail.

In order to be able to round-trip correctly, I believe the default configuration should be equivalent to:

builder
   .useRuntimeType(true)
   .useClassMetadata(true)
   .useClassMetadataWithStaticType(false)

That ensures that class metadata is written when it needs to be and not when it's not really necessary, and that the runtime types are always used and all relevant properties serialized.

That aside, I would still expect serialization to work properly for a simple object graph I used above, regardless of configuration. There are many applications that are pure data consumers, and only care about serialization to JSON because they are exposing data managed by some backend system to clients via REST API, for example. Those types of apps don't care much about round tripping and deserialization, but they do care about correctness of the serialization.

I do agree that deserialization can get hairy when you don't have enough static type information, but there is absolutely no reason for serialization not to work, as all type information we need is there. Which is why I said earlier that we don't really need useRuntimeType option -- it should always be on, and the user should not be able to turn it off as turning it off breaks serialization.