FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0

[Avro] `JsonMappingException` for union types with multiple Record types #168

Closed amorimjuliana closed 5 years ago

amorimjuliana commented 5 years ago

I believe this is a different issue from #123 and #164.

In the following example, we have a union schema with 5 record schemas and an abstract class annotated with @Union as follows:

@Union({
    ClassA.class,
    ClassB.class,
    ClassC.class,
    ClassD.class,
    ClassE.class
})
abstract class AbstractClass {
    public static final Schema SCHEMA = Schema.createUnion(
        ClassA.SCHEMA,
        ClassB.SCHEMA,
        ClassC.SCHEMA,
        ClassD.SCHEMA,
        ClassE.SCHEMA
    );
}

class ClassA extends AbstractClass {}
class ClassB extends AbstractClass {}
class ClassC extends AbstractClass {}
class ClassD extends AbstractClass {}
class ClassE extends AbstractClass {}

Problem 1: serialize as a concrete class and deserialize as an abstract class

final byte[] bytes = avroMapper
    .writer(new AvroSchema(ClassA.SCHEMA))
    .writeValueAsBytes(new ClassA());

final ClassA result = avroMapper
    .readerFor(AbstractClass.class)
    .with(new AvroSchema(AbstractClass.SCHEMA))
    .readValue(bytes); // Error

When we serialize a value using the concrete class ClassA and then try to deserialize it as the AbstractClass, we get the following error:

com.fasterxml.jackson.core.JsonParseException: Invalid index (36); union only has 5 types

    at com.fasterxml.jackson.dataformat.avro.deser.UnionReader._decodeIndex(UnionReader.java:66)
    at com.fasterxml.jackson.dataformat.avro.deser.UnionReader.nextToken(UnionReader.java:36)
    at com.fasterxml.jackson.dataformat.avro.deser.RootReader.nextToken(RootReader.java:31)
    at com.fasterxml.jackson.dataformat.avro.deser.AvroParserImpl.nextToken(AvroParserImpl.java:98)
    at com.fasterxml.jackson.databind.ObjectReader._initForReading(ObjectReader.java:355)
    at com.fasterxml.jackson.databind.ObjectReader._bindAndClose(ObjectReader.java:1596)
    at com.fasterxml.jackson.databind.ObjectReader.readValue(ObjectReader.java:1234)

By contrast, serializing and then deserializing with the concrete class ClassA and its schema works.
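A note on where "Invalid index (36)" may come from. In Avro's binary encoding, a union value is written as the zero-based branch index (a zigzag-encoded variable-length integer) followed by the value itself. Bytes written with ClassA.SCHEMA carry no such prefix, so a reader using the union schema would interpret the first byte of the record body as the branch index. The sketch below decodes such an integer. The class and method names are hypothetical, and the example input is an assumption: the fields of ClassA are not shown in this thread, but a first byte of 0x48 (for instance, the length prefix of a 36-character string) would decode to 36.

```java
final class ZigZagDemo {
    /**
     * Decodes one Avro-style variable-length zigzag integer from the start of buf.
     * Hypothetical helper for illustration; not part of Jackson or Avro.
     */
    static long decodeZigZagVarint(byte[] buf) {
        long raw = 0;
        int shift = 0;
        for (byte b : buf) {
            // The low 7 bits of each byte carry payload, least significant group first.
            raw |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                // High bit clear: last byte. Undo the zigzag mapping.
                return (raw >>> 1) ^ -(raw & 1);
            }
            shift += 7;
            if (shift > 63) {
                throw new IllegalArgumentException("varint too long");
            }
        }
        throw new IllegalArgumentException("truncated varint");
    }

    public static void main(String[] args) {
        // Assumed first byte of a record body that starts with a 36-character string.
        System.out.println(decodeZigZagVarint(new byte[] {0x48})); // prints 36
        // What a union reader would expect for the first branch (ClassA).
        System.out.println(decodeZigZagVarint(new byte[] {0x00})); // prints 0
    }
}
```

If this reading is right, the failure is a framing mismatch between writer schema and reader schema rather than a problem with the record contents.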

Problem 2: serialize as an abstract class

final byte[] bytes = avroMapper
    .writer(new AvroSchema(AbstractClass.SCHEMA))
    .writeValueAsBytes(new ClassA()); // Error

Also, the serialization using the abstract class AbstractClass does not work:

com.fasterxml.jackson.databind.JsonMappingException: Multiple Record and/or Map types, can not figure out which to use for: [{"x":"y"}]

    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._wrapAsIOE(DefaultSerializerProvider.java:509)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:482)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:319)
    at com.fasterxml.jackson.databind.ObjectWriter$Prefetch.serialize(ObjectWriter.java:1396)
    at com.fasterxml.jackson.databind.ObjectWriter._configAndWriteValue(ObjectWriter.java:1120)
    at com.fasterxml.jackson.databind.ObjectWriter.writeValueAsBytes(ObjectWriter.java:1017)

I believe that both cases are valid and should not cause errors, or maybe I'm missing something.
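For reference, the two cases differ on the wire only by a prefix: per the Avro specification, a value written under a union schema is the branch index (zigzag varint) followed by the value encoded with that branch's schema. The following is a minimal sketch of that framing step using only the JDK. The names UnionFraming, encodeZigZagVarint and withUnionIndex are hypothetical, and whether AvroMapper accepts bytes framed this way is an assumption that is not confirmed anywhere in this thread.

```java
import java.io.ByteArrayOutputStream;

final class UnionFraming {
    /** Encodes a value as an Avro-style zigzag variable-length integer. */
    static byte[] encodeZigZagVarint(long value) {
        // Zigzag maps small negative and positive numbers to small unsigned ones.
        long raw = (value << 1) ^ (value >> 63);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((raw & ~0x7FL) != 0) {
            out.write((int) ((raw & 0x7F) | 0x80)); // more bytes follow
            raw >>>= 7;
        }
        out.write((int) raw); // final byte, high bit clear
        return out.toByteArray();
    }

    /** Prepends the union branch index to bytes written with the branch's own schema. */
    static byte[] withUnionIndex(int branchIndex, byte[] recordBytes) {
        byte[] prefix = encodeZigZagVarint(branchIndex);
        byte[] framed = new byte[prefix.length + recordBytes.length];
        System.arraycopy(prefix, 0, framed, 0, prefix.length);
        System.arraycopy(recordBytes, 0, framed, prefix.length, recordBytes.length);
        return framed;
    }
}
```

In the example above, ClassA is the first branch of the union, so its index would be 0 and the prefix a single 0x00 byte.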

cowtowncoder commented 5 years ago

Thank you for reporting this -- I hope to evaluate and merge patch (#170) for inclusion in 2.10.

marcospassos commented 5 years ago

Rebased #176 for 2.10

cowtowncoder commented 5 years ago

@amorimjuliana @marcospassos Thank you for reporting the problem and contributing a fix -- this should be resolved in 2.10.0.pr2 (released, I hope, by end of August)!

cowtowncoder commented 5 years ago

I did have to do minor tweaking to eliminate now unused method variants that do not take forValue (handling of null wrt schema, for the new test, introduced some edge condition... not sure I handled it right). But I assume things work reasonably well at any rate.

marcospassos commented 5 years ago

Thank you, @cowtowncoder!

maxdbn commented 1 year ago

@cowtowncoder Sorry for using this thread, but this is the closest issue I found to what I am having.

I'm trying to serialize a HashMap with a Union Schema of 2 totally different records. Even though one of them fits the map structure, I receive the same error as the OP: Exception in thread "main" com.fasterxml.jackson.databind.JsonMappingException: Multiple Record and/or Map types, can not figure out which to use for.

I'm using the same schema in Python and it works seamlessly. Am I doing something wrong?

Example of the map:

final Map<String, Object> obj = new HashMap<>();
obj.put("date", 19166);
obj.put("recipient", "xxx");
obj.put("member_id", "yyy");
obj.put("server", "zzz");

Jackson code block:

final ObjectWriter mapper = new AvroMapper().writer(new AvroSchema(schema));
return mapper.writeValueAsBytes(obj);

Schema:

[
    {
       "type":"record",
       "name":"type_a",
       "fields":[
          {
             "name":"date",
             "type":[
                "null",
                {
                   "type":"int",
                   "logicalType":"date"
                }
             ],
             "default":null
          },
          {
             "name":"recipient",
             "type":"string"
          },
          {
             "name":"member_id",
             "type":[
                "null",
                "string"
             ],
             "default":null
          },
          {
             "name":"server",
             "type":[
                "null",
                "string"
             ],
             "default":null
          },
...

I didn't post the whole schema because it is very long, but it is essentially a union (a JSON array) containing more records with different structures.

cowtowncoder commented 1 year ago

As things are, this usage is not supported by the Jackson Avro module: handling of Union types is difficult, and what we would need here is for the Avro backend to use the Polymorphic Type handling of jackson-databind. So in theory it would be possible to support, but in practice it is quite difficult, since the type information here comes from the Schema and not from the types themselves.

So for the time being, unfortunately, this would be an "unsupported capability of Avro".
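To make the ambiguity concrete: an untyped Map carries no type name, so the only way to choose a union branch for it would be structural matching against each record's fields. The sketch below shows one naive way to do that. It is not how the Jackson Avro module resolves unions and is not a supported workaround. UnionBranchMatcher is a hypothetical name, the field set for type_a uses only the fields visible in the truncated schema above, and type_b with its fields is invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class UnionBranchMatcher {
    /**
     * Returns the names of record branches whose field names cover every key of the map.
     * Zero results means no branch fits; more than one means the choice is ambiguous.
     */
    static List<String> matchingBranches(Map<String, Set<String>> recordFields,
                                         Map<String, ?> value) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Set<String>> branch : recordFields.entrySet()) {
            if (branch.getValue().containsAll(value.keySet())) {
                matches.add(branch.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> branches = new LinkedHashMap<>();
        // Only the fields visible in the truncated schema; the real record may have more.
        branches.put("type_a", Set.of("date", "recipient", "member_id", "server"));
        // Hypothetical second record, invented for this example.
        branches.put("type_b", Set.of("id", "payload"));

        Map<String, Object> obj = Map.of("date", 19166, "recipient", "xxx");
        System.out.println(matchingBranches(branches, obj)); // prints [type_a]
    }
}
```

Name-based matching like this breaks down as soon as two records share field names or the map is sparse, which is one way to see why a general solution needs real type information rather than guesses from the schema.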