FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0
316 stars 136 forks source link

`jackson-dataformat-ion` type serialization behavior #234

Open jobarr-amzn opened 3 years ago

jobarr-amzn commented 3 years ago

jackson-dataformat-ion polymorphic behaviors

SSCCE to demonstrate behaviors.

This sample code depends on jackson-dataformat-ion, jackson-annotations, jackson-databind, and jackson-core.

import com.amazon.ion.IonValue;// for jackson-dataformat-ion 2.10+
// import software.amazon.ion.IonValue; // for jackson-dataformat-ion 2.8-2.9
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.dataformat.ion.IonObjectMapper;

import java.io.IOException;
import java.awt.Point;

public class Main {
    @JsonTypeInfo(use = JsonTypeInfo.Id.CLASS, include = JsonTypeInfo.As.PROPERTY, property = "@class")
    static public class BaseClass extends Point {
        public BaseClass(int x, int y) { super(x, y); }
    }

    public static class Subclass extends BaseClass {
        public Subclass(int x, int y) { super(x, y); }
    }

    public static void main(String... a) throws IOException {
        IonObjectMapper mapper = new IonObjectMapper();

        Subclass subclass = new Subclass(10, 42);
        Point unrelated = new Point(10, 42);

        // By default no type annotation => no type information
        IonValue unrelatedAsIon = mapper.writeValueAsIonValue(unrelated);
        report(unrelatedAsIon);

        // Form of serialized information will depend on jackson-dataformat-ion version
        IonValue subclassAsIon = mapper.writeValueAsIonValue(subclass);
        report(subclassAsIon);

        // This will fail for jackson-dataformat-ion >= 2.9
        BaseClass roundTripInstance = mapper.readValue(subclassAsIon, BaseClass.class);
        assert (roundTripInstance instanceof Subclass);
        assert (subclass.equals(roundTripInstance));
    }

    private static void report(IonValue v) {
        System.out.printf("%s%n", v.toPrettyString());
    }
}

Available jackson-dataformat-ion behaviors:

v2.8

We need to modify the example to work with 2.8:

- import com.amazon.ion.IonValue;// for jackson-dataformat-ion 2.10+
+ // import com.amazon.ion.IonValue;// for jackson-dataformat-ion 2.10+
- // import software.amazon.ion.IonValue; // for jackson-dataformat-ion 2.8-2.9
+ import software.amazon.ion.IonValue; // for jackson-dataformat-ion 2.8-2.9

Afterwards it should yield:

Output


{
  x:10e0,
  y:42e0
}

{
  '@class':"Main$Subclass"
  x:10e0,
  y:42e0
}

v2.9 - Present (2.12 release)

Assuming 2.10+ for consistent IonValue import statement.

Output


{
  x:10e0,
  y:42e0
}

Main$Subclass::{
  x:10e0,
  y:42e0
}
Exception in thread "main" com.fasterxml.jackson.databind.exc.InvalidTypeIdException: Missing type id when trying to resolve subtype of [simple type, class Main$BaseClass]: missing type id property '@class'
 at [Source: UNKNOWN; line: -1, column: -1]
    at com.fasterxml.jackson.databind.exc.InvalidTypeIdException.from(InvalidTypeIdException.java:43)
    at com.fasterxml.jackson.databind.DeserializationContext.missingTypeIdException(DeserializationContext.java:1790)
    at com.fasterxml.jackson.databind.DeserializationContext.handleMissingTypeId(DeserializationContext.java:1319)
    at com.fasterxml.jackson.databind.jsontype.impl.TypeDeserializerBase._handleMissingTypeId(TypeDeserializerBase.java:303)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedUsingDefaultImpl(AsPropertyTypeDeserializer.java:166)
    at com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:107)
    at com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeWithType(BeanDeserializerBase.java:1197)
    at com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:68)
    at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4482)
    at com.fasterxml.jackson.dataformat.ion.IonObjectMapper.readValue(IonObjectMapper.java:154)
    at Main.main(Main.java:33)

After PR #232

Output


{
  x:10e0,
  y:42e0
}

Main$Subclass::{
  x:10e0,
  y:42e0
}

After PR #232, with native types disabled at Mapper construction

- IonObjectMapper m = new IonObjectMapper();
+ IonObjectMapper m = new IonObjectMapper()
+  .disable(IonGenerator.Feature.USE_NATIVE_TYPE_ID);

Output


{
  x:10e0,
  y:42e0
}

{
  '@class':"Main$Subclass"
  x:10e0,
  y:42e0
}

Open Question: What is the ideal behavior?

After PR #232 both achievable IonMapper configurations won't serialize type data without e.g. a POJO annotation, but write behavior can seem a little surprising. A user who has attempted to configure property-based serialization of type information will be surprised to see it showing up as an annotation instead.

An argument can be made that the most locally or specifically expressed user preference should control behavior, in which case a @JsonTypeInfo annotation should override format-native behavior. I.e. the serialization of Subclass above should naturally be:

{
  '@class':"Main$Subclass"
  x:10e0,
  y:42e0
}

This is complicated by the fact that As.Property is the default for @JsonTypeInfo. Take this example:

@JsonTypeInfo(use=Id.Class)
public class BaseClass {}

The user has not specified include or as and it's not obvious that the user does not want format-native type serialization behavior.

This behavior doesn't emerge from jackson-dataformat-ion directly, but from JsonGenerator in jackson-core, AsPropertyTypeSerializer in jackson-databind, etc.

I'm looking for some commentary on intent here, and what options might be available. It's not clear to me how much flexibility there is here.

cowtowncoder commented 3 years ago

Ok. Just to make sure I understand this correctly, the concept of native type ids is somewhat orthogonal to @JsonTypeInfo: it was added after @JsonTypeInfo existed. Or maybe better way is to say that @JsonTypeInfo indicates logical concept of type id, and native/non-native distinction (for formats that have native type ids) is separate questions. As such @JsonTypeInfo has no role to play in decision: it has nothing to indicate such handling at this point.

Streaming API, on the other hand, exposes existence of native type ids. But due to existing design, it can only indicate it globally (use native for all or none), not for specific types. But even if @JsonTypeInfo (or some new possibly format-specific annotation) could indicate use (or not) of native mechanism on per-type basis, databind would not currently have any way to connect that in.

With that, the current choice really is that user needs to indicate usage on per-call (read, write) basis: default setting exists for ObjectMapper (or more accurately, underlying TokenStreamFactory, passing it to parser/generator being created), and can be changed on per-call basis.

Generator (subtype of JsonGenerator) has bit more say on actual inclusion of type id, with writeTypeId() (native) and writeTypePrefix()/writeTypeSuffix() (non-native) calls.

I am not sure if that helps in figuring out what to do with the change discussed here but thought it might help at least understand constraints.

cowtowncoder commented 3 years ago

One quick question/note here: I am hoping to get 2.12.1 released in about 1 week (on Dec 29 or 30), and I think that this issue itself is not a blocked since the default behavior would be the same as that of 2.12.0. Another thing to note is that since this module was only officially release as part of 2.9 (although backdated versions exist) so in a way behavior of 2.9.0 could be considered canonical. That does not really help anyone who started with an earlier version of course...

Anyway: I guess what I am asking is just that if there is something urgent wrt 2.12.1 itself (regarding #232 that added setting to allow avoiding writing of native type ids), please add a note here (or file separate issue for specific change; I think this is bigger question altogether).

jobarr-amzn commented 3 years ago

is something urgent wrt 2.12.1 itself

With #232 we have a path forward. This question isn't urgent- I don't even know for sure that it has a satisfying answer. I don't know from Jackson's perspective what's feasible or advisable here so I'm floating this issue to raise a discussion.

Another thing to note is that since this module was only officially release as part of 2.9 (although backdated versions exist) so in a way behavior of 2.9.0 could be considered canonical.

Agreed 100%, I make the same argument myself. jackson-dataformat-ion default behavior shouldn't change from the only released behavior it has ever had.

That does not really help anyone who started with an earlier version of course...

True :) If upgrading from the unreleased ("2.8") type serialization behavior to jackson-dataformat-ion 2.12 you have to make sure that all services pick up necessary code changes or we end up with mixed mode fleets during deployment etc. etc.

I don't understand how all the pieces fit together here yet- it doesn't seem to me that @JsonTypeInfo is orthogonal to native type ids since use of the annotation can specify exactly how type information should be encoded and without the annotation no type information is encoded at all. It feels off that specifying "include type information as a property named '@class'" can result in a behavior different from that. I still don't get all the constraints though.

cowtowncoder commented 3 years ago

Sounds good.

As to Native Type Id vs @JsonTypeInfo: I agree that conceptually they should be connected; what I tried to explain is how pieces as they are currently do not connect. Since @JsonTypeInfo was added first, it has no awareness of possibility of native type id. There are other complications too, since Jackson annotations are mostly format-agnostic, including @JsonTypeInfo; and native type id support can be conditional as well as format-specific. But the most pertinent part I guess is that none of @JsonTypeInfo metadata (properties it has, representation within databind) is related to possibility of native type id.

That is the situation at this point, anyway. I am open to improvement suggestions, as always.

As to path forward it sounds like this issue does not block 2.12.1, and that #232 should be included even if it is likely incomplete solution wrt problem of using (or not) of native type ids.