FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0

Unable to Serialize Guava Multimap in Avro #103

Closed ghost closed 7 years ago

ghost commented 7 years ago

Using Jackson version 2.9.0.

One of our classes utilizes Guava's Multimap. It is a Multimap<String, String>. When attempting to write our object in Avro, the following exception is thrown:

com.fasterxml.jackson.databind.JsonMappingException: Not an array schema: "string" (through reference chain: com.company.www.app.domain.model.OurObject["ourMultimap"])

    at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:391)
    at com.fasterxml.jackson.databind.JsonMappingException.wrapWithPath(JsonMappingException.java:351)
    at com.fasterxml.jackson.databind.ser.std.StdSerializer.wrapAndThrow(StdSerializer.java:316)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:727)
    at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:719)
    at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:719)
    at com.fasterxml.jackson.databind.ser.BeanSerializer.serialize(BeanSerializer.java:155)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider._serialize(DefaultSerializerProvider.java:480)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:319)
    at com.fasterxml.jackson.databind.ObjectWriter$Prefetch.serialize(ObjectWriter.java:1396)
    at com.fasterxml.jackson.databind.ObjectWriter._configAndWriteValue(ObjectWriter.java:1120)
    at com.fasterxml.jackson.databind.ObjectWriter.writeValueAsBytes(ObjectWriter.java:1017)
    at com.company.www.app.domain.model.OurObjectMarshallerTest.testAvroSerialization(OurObjectMarshallerTest.java:277)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
    at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:234)
    at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
Caused by: org.apache.avro.AvroRuntimeException: Not an array schema: "string"
    at org.apache.avro.generic.GenericData$Array.<init>(GenericData.java:241)
    at com.fasterxml.jackson.dataformat.avro.ser.AvroWriteContext._createArray(AvroWriteContext.java:184)
    at com.fasterxml.jackson.dataformat.avro.ser.MapWriteContext.createChildArrayContext(MapWriteContext.java:40)
    at com.fasterxml.jackson.dataformat.avro.AvroGenerator.writeStartArray(AvroGenerator.java:364)
    at com.fasterxml.jackson.datatype.guava.ser.MultimapSerializer.serializeFields(MultimapSerializer.java:327)
    at com.fasterxml.jackson.datatype.guava.ser.MultimapSerializer.serialize(MultimapSerializer.java:279)
    at com.fasterxml.jackson.datatype.guava.ser.MultimapSerializer.serialize(MultimapSerializer.java:39)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.serializeAsField(BeanPropertyWriter.java:727)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.serializeFields(BeanSerializerBase.java:719)
    ... 40 more

This is how we are setting up the writer:

// Generate an Avro schema for OurObject through the mapper's format visitor
final AvroSchemaGenerator generator = new AvroSchemaGenerator();
final ObjectMapper avroMapper = new AvroMapper()
        .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
        .registerModule(new GuavaModule())
        .registerModule(new AvroModule());
avroMapper.acceptJsonFormatVisitor(OurObject.class, generator);
final AvroSchema schema = generator.getGeneratedSchema();
// Bind the generated schema to a writer
avroWriter = avroMapper.writer(schema);
...
// This call throws the JsonMappingException shown above
final byte[] avro = avroWriter.writeValueAsBytes(ourObject);

And the generated Avro schema looks something like:

{
  "type":"record",
  "name":"OurObject",
  "namespace":"com.company.www.app.domain.model",
  "fields":[
    {
      "name":"ourMultimap",
      "type":[
        "null",
        {
          "type":"map",
          "values":"string",
          "java-class":"com.google.common.collect.Multimap"
        }
      ]
    }
  ]
}

And the class would look something like:

public class OurObject {
    public Multimap<String, String> ourMultimap;

    public Multimap<String, String> getOurMultimap() {
        return ourMultimap;
    }

    public void setOurMultimap(Multimap<String, String> ourMultimap) {
        this.ourMultimap = ourMultimap;
    }

    public Collection<String> getOurMultimapValue(String ourMultimapKey) {
        return this.ourMultimap.get(ourMultimapKey);
    }
}

It would appear that these two libraries (jackson-dataformat-avro & jackson-datatype-guava), or more specifically AvroSchemaGenerator & GuavaModule, don't play nicely together, unless I'm totally messing things up.

cowtowncoder commented 7 years ago

Thank you for reporting this. I suspect the problem comes from the Guava module declaring a content model that differs from the actual serialization: that is, it claims to serialize each value as a simple string, but that cannot be right, since it produces something like Map<String,String[]>. If so, it would be a Guava module problem, but I'll look into this first to see if that is true.
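
To illustrate the mismatch described in this comment: the schema in the report declares the map's values as "string", while a multimap is logically key -> collection of values, so each key ends up written as an array. The sketch below uses only java.util as a stand-in; it is not Guava's MultimapSerializer or any Jackson code, and the class and method names are hypothetical. Assuming that output shape, a schema consistent with it would declare the values as {"type":"array","items":"string"} rather than "string", though the schema the eventual fix generates may differ.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical stand-in: plain java.util only, no Guava or Jackson involved.
final class MultimapShapeSketch {

    // Renders key -> values the way a multimap is logically laid out:
    // one array per key, even when a key has a single value.
    // No JSON escaping is done; this only illustrates the structure.
    static String toJson(Map<String, List<String>> asMap) {
        StringJoiner fields = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, List<String>> e : asMap.entrySet()) {
            StringJoiner values = new StringJoiner(",", "[", "]");
            for (String v : e.getValue()) {
                values.add("\"" + v + "\"");
            }
            fields.add("\"" + e.getKey() + "\":" + values);
        }
        return fields.toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> asMap = new LinkedHashMap<>();
        asMap.put("a", Arrays.asList("1", "2"));
        asMap.put("b", Arrays.asList("3"));
        // prints {"a":["1","2"],"b":["3"]}
        System.out.println(toJson(asMap));
    }
}
```

Every value position holds an array, which is why a writer bound to a schema with "values":"string" fails as soon as the serializer starts the first array.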

cowtowncoder commented 7 years ago

Doh! At first I thought the serializer implemented discovery properly, but I just realized that the code SHOULD differ from the standard MapSerializer to account for that intermediate Array node...

cowtowncoder commented 7 years ago

Reproducing the issue is easy enough (although, due to dependencies, the test needs to be stashed in jackson-compat-minor or such). But fixing it gets a bit involved because there's no good way to indicate wrapping as is: rather, an intermediate JsonSerializer (or its equivalent) is needed.

cowtowncoder commented 7 years ago

Done -- this will be fixed via

https://github.com/FasterXML/jackson-datatypes-collections/issues/19

which will be included in 2.8.10 and 2.9.1 once those are released. In the meantime, you may want to use snapshot builds or local builds.