FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0
316 stars 136 forks source link

Incompatibility with Avro >=1.9.0 (upgrade to Avro 1.11.3) #167

Closed Sage-Pierce closed 2 months ago

Sage-Pierce commented 5 years ago

Avro version 1.9.0 was released last month (https://mvnrepository.com/artifact/org.apache.avro/avro) and my team has run in to compatibility issues between jackson-dataformat-avro and this latest release.

In particular, it looks to be the case that Avro has switched from Codehaus Jackson to FasterXML Jackson, which results in NoSuchMethodError thrown when accessing utility methods.

The following test exposes the issue:

    @Test
    public void beanShouldBeSerializableAsAvroBinaryAndDeserializable() throws Exception {
        TestBean testBean = new TestBean();
        testBean.setData("Data");

        AvroMapper avroMapper = new AvroMapper();

        AvroSchema avroSchema = avroMapper.schemaFor(TestBean.class);

        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        avroMapper.writer().with(avroSchema).writeValue(outputStream, testBean);

        byte[] serialized = outputStream.toByteArray();

        assertNotNull(serialized);
        assertTrue(serialized.length > 0);

        TestBean deserialized = avroMapper.readerFor(TestBean.class).with(avroSchema).readValue(serialized);

        assertEquals(testBean, deserialized);
    }

    public static final class TestBean {

        private String data;

        @Override
        public boolean equals(Object o) {
            if (this == o)
                return true;
            if (o == null || getClass() != o.getClass())
                return false;
            TestBean testBean = (TestBean) o;
            return Objects.equals(data, testBean.data);
        }

        @Override
        public int hashCode() {
            return Objects.hash(data);
        }

        public String getData() {
            return data;
        }

        public void setData(String data) {
            this.data = data;
        }
    }

This test passes on v1.8.2 of Avro, but fails with v1.9.0 on the following:

java.lang.NoSuchMethodError: org.apache.avro.util.internal.JacksonUtils.toObject(Lorg/codehaus/jackson/JsonNode;)Ljava/lang/Object;

    at com.fasterxml.jackson.dataformat.avro.schema.RecordVisitor.schemaFieldForWriter(RecordVisitor.java:192)
    at com.fasterxml.jackson.dataformat.avro.schema.RecordVisitor.optionalProperty(RecordVisitor.java:121)
    at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.depositSchemaProperty(BeanPropertyWriter.java:839)
    at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.acceptJsonFormatVisitor(BeanSerializerBase.java:861)
    at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.acceptJsonFormatVisitor(DefaultSerializerProvider.java:566)
    at com.fasterxml.jackson.databind.ObjectMapper.acceptJsonFormatVisitor(ObjectMapper.java:3874)
    at com.fasterxml.jackson.databind.ObjectMapper.acceptJsonFormatVisitor(ObjectMapper.java:3853)
    at com.fasterxml.jackson.dataformat.avro.AvroMapper.schemaFor(AvroMapper.java:96)

Note that we are using com.fasterxml.jackson.dataformat:jackson-dataformat-avro:2.9.9

cowtowncoder commented 5 years ago

Hmmh. So the problem is that API of Apache Avro library is incompatible between 1.8.x and 1.9 -- one exposes Jackson 1.x JsonNode (1.8 and before), latter Jackson 2.x JsonNode. This is major incompatibility and something that Jackson Avro module probably can do nothing about, in the sense that if you use Jackson Avro Module 2.9 or earlier, you MUST use Apache Avro module version 1.8.x. I can change this for Jackson 2.10 to upgrade to require 1.9.0 (with similar limitation that earlier versions can not be used).

Thank you for reporting the issue: I think it is good that apache avro library has upgraded to Jackson 2.x, but it is bit problematic as types are exposed in API on short term.

salt-tony commented 5 years ago

As of Apache Avro 1.9.0 they updated to depend on Jackson 2.9.7, but the problem is actually in the com.fasterxml.jackson.dataformat.avro.schema.RecordVisitor class which is depending on "org.apache.avro.util.internal.JacksonUtils", which used to reference the old org.codehaus.jackson.JsonNode. The problem is the Jackson RecordVisitor class should not be relying on an 'internal' API from org.apache.avro.util.internal, and certainly should not be referring to classes from old versions of Jackson.

cowtowncoder commented 5 years ago

@salt-tony yes, I understand the mechanism pretty well and attempted to summarize above. But I am not sure of the point you are trying to make here?

My interest is not in figuring out what to blame (I agree that relying on internal parts of another library is not a good idea) but to figure out how to try to contain further version incompatibilities. Since Jackson versions up to 2.9.x will be bound to dependencies of Avro lib 1.8.x, it is not possible to upgrade to later Avro library there; and anything using Jackson 2.9.x Avro module can not use apache Avro 1.9.x or above. But there is further problem that if with 2.10 we upgrade dependency here, it will then require upgrade of apache avro library for any mixed use cases. Alternatively it would be possible keep status quo for Jackson 2.x and not upgrade to apache avro 1.9.x and beyond; and wait for Jackson 3.0 to make the jump.

Note, too, that Avro module is only relying on Jackson 1.x types because Avro 1.8.x exposes default values using 1.x JsonNode values; and internal JacksonUtils class seems to have been referenced for necessary conversions (I did not write code in question and failed to notice reference on code review); but even without it, reference to 1.x types is unavoidable before new version.

cowtowncoder commented 5 years ago

After investigating this for a bit now I am much less hopeful in figuring out a working way to upgrade Avro module to work with Apache Avro lib 1.9.x. Part of the problem is that the complicated way some of unit tests work, it is difficult to see what actually breaks: I can get code to compile fine, but many things do not seem to work the way they used to in avro lib.

So there is a distinct possibility that Avro module will require pre-1.9 version of Apache Avro lib, at least for 2.10.

baharclerode commented 5 years ago

Ohhhh man Avro has gone and created a right nasty mess. I've done a bit of digging, and there's some good news and some bad news:

  1. Good: jackson-dataformat-avro can be made binary compatible with both 1.8.X and 1.9.X without too much work. We were leveraging some Avro helper utils to avoid duplicating the code and increase the likelihood of remaining api-compatible as things evolved. These just need to be replaced with fresh code that lives in jackson-dataformat-avro. Slightly increases the risk of api-compatibility (specifically around how default values are encoded and interpreted in schemas), but all of the regression tests still pass with 1.8.X and I expect the use-cases to be limited to a domain of types that shouldn't have compatability issues.
  2. Bad: Avro changed the way they encode type information in schemas. In 1.8.X, nested classes were encoded as

    {"type":"record","namespace":"package.OuterClass$","name":"InnerClass", ....}

    Which was really helpful, because you could tell the difference between a nested class and a top-level class, and that made runtime type discovery relatively easy because these could be unambiguously concatenated and then fed into Class.forName() to find the type during deserialization. However, with 1.9.X, they've started serializing nested classes sans the trailing $ in the namespace:

    {"type":"record","namespace":"package.OuterClass","name":"InnerClass", ....}

    Which we can no longer simply feed into Class.forName(), and now have to do a "Guess and check" by probing both the non-nested and nested combinations of namespace+name. This makes 1.9.X not forwards compatible with 1.8.X over the wire.

The unit tests catch this pretty handily (all of the "Apache(1.9.X) -> Jackson w/ Apache(1.9.X) schema" round-trip tests start failing, while their counterparts do not), and once fixed reveal a handful of other edge cases that need to be looked at. It appears that most of the other failing unit tests are similarly related to this issue, in the forms of Enums not resolving correctly or Unions not resolving correctly because we had relied upon Avro helpers to resolve these situations in a compatible manner; It looks like they will have to be rewritten and kept in sync instead.

In general, I recommend staying away from 1.9.X unless you're prepared to upgrade your whole ecosystem (or at least every consumer) all at once because the schema format is not forwards compatible. I think these can all be worked around/fixed/patched so that Jackson can read from both versions and write from both versions without any major overhauls/work, but we do need to clone&own some pieces we were leveraging from Avro.

cowtowncoder commented 5 years ago

@baharclerode Thank you for digging bit deeper! Your comments make more sense to me, and I think that at least for 2.10 we'll better stay on 1.8.x. Even if it means we can't get rid of Jackson 1.x dependency.

I would like to see the minor cleanup wrt replacing use of internal.JacksonUtils, as that makes needed changes slightly simpler.

baharclerode commented 5 years ago

I'll see if I can get two PRs together:

  1. Remove direct dependency on Jackson 1.X / internal.JacksonUtils
  2. Support deserializing from Apache 1.9.X schemas

The irony is I'm pretty sure this behavior was changed because the $ in namespaces isn't allowed by the spec, but the java implementation used it accidentally and it was never caught because until now, namespaces were not validated by the java implementation (AVRO-2143); However, the java implementation will still generate invalid namespaces if you doubly-nest a class, i.e.:

package example;
public class Foo {
    public static class Bar {
        public static class Baz {}
    }
}

will still have a name of Baz and a namespace of example.Foo$Bar since they only fixed it for one level deep and didn't handle anything beyond that.

cowtowncoder commented 5 years ago

Heh. Oh boy. But yes, +1 for those PRs. I'll also need to find time to work on other submitted PRs, bugs against Avro module, to get things shipshape for 2.10.

dani-e-l commented 5 years ago

Hi cowtowncoder are you working on this?

Avro 1.8.x has vulnerabilities and we need to update avro or remove jackson-dataformats-binary from our projects. Is there any possibility to support 1.9.x in short term?

cowtowncoder commented 5 years ago

No, I do not plan to work on this. Avro 1.9.x seems fundamentally problematic wrt its dependencies.

ZachManno commented 5 years ago

@DanielCharczynski We fixed this vulnerability by using tech.allegro.schema.json2avro converter version 0.2.9

Fokko commented 4 years ago

Could you check if this is still the case with Apache Avro 1.9.2? Would love to help here.

RyanSkraba commented 4 years ago

Hello! A couple of quick points:

  1. Name validation can be turned off in Schema.Parser. It's probably not great to use non-standard names across platforms/languages, but it's possible.
  2. The schema can store other metadata. The java-class attribute is probably a good bet to store the Java class name!
  3. There shouldn't be any problem with forward/backward compatibility strictly "over-the-wire" (I sincerely hope!) Binary serialization doesn't include name information at all. However, it's probably possible for the 1.8.x Java implementation to create an Avro file with a "bad" schema, or store it in a registry, etc.

+1 to helping get this working for jackson!

Sage-Pierce commented 4 years ago

@Fokko I can confirm this is still an issue with Avro 1.9.2 and Jackson versions 2.9.8, 2.9.10, and 2.10.2

cowtowncoder commented 4 years ago

@Sage-Pierce thank you for verifying this.

baharclerode commented 4 years ago
  1. The schema can store other metadata. The java-class attribute is probably a good bet to store the Java class name!

It would be great! ... Except the reference implementation doesn't do this for record, enum, or fixed types. There's a lot of tooling and infra that depends heavily upon the reference implementation of Avro, so breaking compatibility with it is a dangerous game. IMHO the better solution would have been for the reference implementation to switch to preferring java-class if it exists, and always including java-class (even for record/enum/fixed types) whenever the java type isn't $namespace.$className. Honestly that'd probably be a good direction to take Jackson-Avro as it gives us more options on how to handle data correctly in a mixed-version ecosystem.

  1. There shouldn't be any problem with forward/backward compatibility strictly "over-the-wire" (I sincerely hope!) Binary serialization doesn't include name information at all. However, it's probably possible for the 1.8.x Java implementation to create an Avro file with a "bad" schema, or store it in a registry, etc.

The binary side of things are indeed fully compatible, but the binary can't be read without a schema, so I consider the schema part of compatibility. If a 1.8 client can't read 1.9 data written with a 1.9 schema and map it correctly, backwards compatibility was broken. Even though forward compatibility was not broken, there are some relatively simple/common usage patterns that make upgrading a herculean effort. (Essentially, all consumers have to be upgraded first, and only then you can upgrade the producers. But what do you do for an application that is both a consumer and producer? load different versions of the library on different classloaders?)

cowtowncoder commented 4 years ago

Hmmh. I must say that I find possible incompatibility between 1.8 and 1.9 tooling deeply troubling, especially as binary format should itself be compatible. And even from basic version standpoint it is... not great.

While I do not necessarily understand full complexity here, it seems to me that it may be necessary to sort of work Jackson Avro format module into 2 implementations; existing older one for 1.8 and prior, and then something else for 1.9 and beyond? Naming of this thing is probably ... interesting problem, but both could co-exist and aside from question of what nominal "format name" (property that Jackson format implementations declare they support), might work. Obviously maintaining 2 separate backends is not ideal at all, but at least it would give users the possibility of selecting one that works better -- or, possibly, with enough work, ability to dynamically select one or the other?

I assume new module would be something like jackson-dataformat-avro19 / Avro19Module?

baharclerode commented 4 years ago

Ultimately, I don’t think we need to maintain two different backends. Jackson is flexible enough that we can upgrade to avro 1.9, fix the deserialization/serialization code to handle 1.8 or 1.9 schemas automatically, and add a mapper feature to select if you want 1.8 or 1.9 schemas to be generated (defaulting to 1.8 for compatibility). Switching an existing ecosystem from 1.8 to 1.9 will still be painful, but Jackson won’t be the bottleneck/blocker.

I’m mostly just frustrated at the reference implementation because unlike Jackson, I can’t have full 1.8 and 1.9 support side-by-side in the same app without shading or classloader magic.

Fokko commented 4 years ago

To give some background, Avro 1.8.x was still on the old codehaus Jackson. Avro 1.8.2 was released back in June 2017, quite a while ago. Since then we've updated a lot of outdated dependencies, including moving from Jackson 1.x to 2.x and deprecating Joda-time and such. Since back then there we're some Jackson objects part of the public API, we've decided to remove these since we had to break the API anyway (sad, I know).

We've analyzed the API's that we've broke, this is mostly Jackson and Joda stuff: http://people.apache.org/~busbey/avro/1.9.0-RC4/1.8.2_to_1.9.0RC4_compat_report.html The Joda can still be enabled by setting an option, but we didn't want to ship Avro with the old version of Jackson anymore.

cowtowncoder commented 4 years ago

@Fokko removal of Jackson 1.x (and especially removal of the exposure via Avro API) is great, and no complaints there. Replacements are (as far as I understand) fine and not problematic in themselves. But it seems that some changes outside this particular dependency are bigger issues

@baharclerode if that can be done, great. I wonder if this would fit in timeframe for 2.11 (quite imminent), or 2.12? Or, put another way: are there things to be done in 2.11, with limited compatibility problems, but that would lead towards breaking changes in 2.12? I would be happy to help coordinate preparation changes (there are PRs, I know).

OneCricketeer commented 4 years ago

🍿 😮 🍿

Are there Avro JIRAs to file/watch for complaining about the "minor" semantic number introducing backwards incompatible changes?

RyanSkraba commented 4 years ago

For info: @cricket007 https://issues.apache.org/jira/browse/AVRO-2687 and a link to the last discussion on the mailing list. Avro has not followed literal semantic versioning, which is often unexpected. Your (tactful and gentle) perspective would be very welcome :smile:

@baharclerode : I do agree with you about the schema being an important part for backwards/forwards compatibility. The current behaviour was definitely a fix for cross-platform compatibility in 1.9.x... it's unfortunate nobody foresaw code relying on the invalid names from 1.8.x!

I think I had a basic misunderstanding on what Jackson-Avro needed to work -- I'm taking a deeper look.

Avro is planning a 1.10.x release in May (which should not be considered a minor semantic version bump). It might be worthwhile working together and targetting that to have the least-painful / most-correct fixes in both projects -- maybe we can restore the "better solution" :point_up: behaviour for ReflectData at that point?

iemejia commented 4 years ago

@Fokko removal of Jackson 1.x (and especially removal of the exposure via Avro API) is great, and no complaints there. Replacements are (as far as I understand) fine and not problematic in themselves. But it seems that some changes outside this particular dependency are bigger issues

Which (other) changes do you refer to concretely @cowtowncoder ?

cowtowncoder commented 4 years ago

@iemejia I wish I remembered the exact details, from my attempts to upgrade, but I think there were actual behavior changes between apache avro 1.8 and 1.9, so that although I was able to make Jackson Avro module compile with changes to use new methods in 1.9, existing test suite failures, possibly related to changes in schema name generation changes. I think @baharclerode is a better source for concrete details at this point. I did create branch exp/avro-1.9-for-2.10 for my upgrade attempt, so I can reproduce the test fails.

As to Avro 1.10.x, it sounds like this might be good opportunity to resolve the upgrade challenges. I am trying to finalize Jackson 2.11 soon; maybe we can do some of the fixes for that -- like at least remove use of avro lib's internal JacksonUtils. But even beyond that I hope to reduce time between 2.x minor versions so that anything fixed for 2.12 would not take as long as 2.10 did.

@baharclerode I'll have a look at that PR now.

cowtowncoder commented 4 years ago

First patch merged, so now module does compile against avro lib 1.8.2 and 1.9.2, and in latter case does not pull in Jackson 1.x any more. There would be those interoperability test failures, but one problem at a time.

iemejia commented 4 years ago

Great to see this progressing. If we have further issues please don't hesitate to contact us, create JIRAs etc. As you definitely should know it is sometimes hard to track every use of our API but we are eager to not break our users or provide alternatives if so.

For future breakage awareness we are going to remove Joda Time in Avro 1.10 so please use the Java Time APIs if possible to ease future migration.

baharclerode commented 4 years ago

maybe we can restore the "better solution" ☝️ behaviour for ReflectData at that point?

@RyanSkraba It's all a bit moot until my organization has a migration path off of 1.8, which will involve running a mixed ecosystem of 1.8.x and 1.9.x both as producers and consumers concurrently. Until we can figure out that, I'm effectively stuck on 1.8. We've got live distributed systems spanning multiple teams that we can't take down for atomic upgrades.

As you definitely should know it is sometimes hard to track every use of our API but we are eager to not break our users or provide alternatives if so.

@iemejia Yeah, I get that. And thanks for the callout on Joda; I don't think Jackson-Avro has any hard references to that, but I've not looked to see if the API was transitively exposed anywhere by virtue of us relying on type hints embedded in the schemas.

Which (other) changes do you refer to concretely @cowtowncoder ?

I believe @cowtowncoder is referring to the schema name -> class name resolution; These don't involve the Jackson 1.x dependency, but break 1.8.x implementations from using schemas generated by the 1.9 reference implementation when using reflection to map schema types to concrete types in scenarios involving nested classes.

Essentially, I would propose the following in the Avro reference implementation:

  1. We need a patch release for 1.8.x that allows it to handle 1.9.x schemas reflected from nested POJOs that were previously working with 1.8.x; This will probably involve patching SpecificData.getClass(Schema) to properly find nested classes when the schema both has $ and doesn't have $ in the namespace; as well as patching GenericData.resolveUnion(Schema,Object) to correctly resolve the union branch for instances of nested classes regardless of whether the schema has a trailing $ in the namespace or not.

  2. We need a patch release for 1.9.x that allows it to handle 1.8.x schemas reflected from nested POJOs; This will probably involve patching GenericData.resolveUnion(Schema,Object) to correctly resolve the union branch for instances of nested classes regardless of whether the schema has a trailing $ in the namespace or not.

  3. We need a patch release for 1.9.x that makes the use of java-class pervasive. Update ReflectData.createSchema(Type,Map) to always generate this property for records, enums, and fixed schemas whenever Class.forName(mangledNamespace + "." + mangledSimpleName) doesn't map to the concrete class; Update SpecificData.getClass(Schema) to utilize java-class even for maps, records, enums, and fixed schemas if it exists.

This will accomplish two things:

  1. 1 and 2 provide an easy upgrade path from 1.8.x to 1.9.x by more fully supporting a mixed ecosystem.
  2. 3 prepares to fix the remaining namespace allowed-characters bug in 1.10.x

In 1.10.x, we can blindly mangle the namespaces and record names however we want, since older 1.9.x clients will get the correct class name from the java-class property.

(We will need to make corresponding tweaks in Jackson-Avro, which is the second of the two PRs I described in my earlier comments in this thread, but have not written yet).

cowtowncoder commented 4 years ago

@baharclerode If I understand above correctly, it sounds like...

  1. Jackson avro 2.11 does compile, run against apache avro 1.8 and 1.9, with fixes, but behavior varies wrt schema handling, failing some tests against 1.9. This allows some users to possibly upgrade to 1.9 if they do not need backwards compatibility; but this requires explicit override of version (and we do not want to increase default dep version yet due to compatibility problems)
  2. Further improvements to Jackson Avro format module now depend on changes to apache avro library, as this module uses it for most of schema handling.

If accurate, I think I can proceed with Jackson 2.11.0 at this point -- it is not delayed by Avro module, there have been other issues, but there is some urgency to getting some of improvements, fixes out -- and baseline increase wrt apache avro lib could go in 2.12.0 if issues can be resolved.

conniey commented 4 years ago

I've run across the same vuln warnings that 1.8.2 has and noticed that avro 1.10.0 is released. Is it worthwhile to test this new minor versions to see if the issues are resolved?

Sage-Pierce commented 4 years ago

FWIW, @baharclerode the issue you brought up about nested class name deserialization breaking with Avro [1.9.0, 1.10.0) may have been fixed in ~1.10.0~ (actually, looks like it may not have gotten in to 1.10.0) by this PR

Roma7-7-7 commented 4 years ago

+1. Got same issue and so far the only workaround seems to be downgrade to avro:1.8.x

fmiguelez commented 2 years ago

Issue still happening with latest jackson (2.13.4) and Avro (1.10.2):

java.lang.NoSuchMethodError: 'com.fasterxml.jackson.core.io.ContentReference com.fasterxml.jackson.dataformat.avro.AvroFactory._createContentReference(java.lang.Object, int, int)'
        at com.fasterxml.jackson.dataformat.avro.AvroFactory.createParser(AvroFactory.java:364) ~[iCeU_m3Pa5bKgb_PKMulNQ/:?]
        at com.fasterxml.jackson.dataformat.avro.AvroFactory.createParser(AvroFactory.java:358) ~[iCeU_m3Pa5bKgb_PKMulNQ/:?]
        at com.fasterxml.jackson.dataformat.avro.AvroFactory.createParser(AvroFactory.java:19) ~[iCeU_m3Pa5bKgb_PKMulNQ/:?]
        at com.fasterxml.jackson.databind.ObjectReader.createParser(ObjectReader.java:1074) ~[java-instance.jar:?]
        at com.fasterxml.jackson.databind.ObjectReader.readTree(ObjectReader.java:1782) ~[java-instance.jar:?]
cowtowncoder commented 2 years ago

@fmiguelez Yes, there is no solution to this since I don't think anyone has had time to look into this lately.

It looks like there's version 1.11.x of Apache Avro (see https://mvnrepository.com/artifact/org.apache.avro/avro) which might (but might not) help. But someone would need to dig into issue for anything to happen here; I don't have time right now, unfortunately.

/cc @baharclerode

fmiguelez commented 2 years ago

We have rolled back to version 2.12.6.1 (provided by bom 2.12.6.20220326) and posted error does not occur. It works with Avro 1.10.2.

cowtowncoder commented 2 years ago

@fmiguelez I don't think your problem is same as one reported here: it seems more like your version of jackson-core is older than that of jackson-dataformat-avro -- these versions MUST match so that jackson-core has same or higher minor version as jackson-dataformat-avro.

But I do notice that trying to upgrade version to build this module, there are test failures for Apache Avro library versions 1.9.x, 1.10.x and 1.11.x. So you are right in that this module does not work with newer Apache Avro libraries; but you also seem to have another issue unrelated to this one.

This is something someone would need to figure out: I don't think newer versions of Apache Avro will change in a way to make Jackson Avro module work as-is. There is some work related to class name handling as per earlier discussion here.

cowtowncoder commented 1 year ago

Looks like version 1.11.x released:

https://mvnrepository.com/artifact/org.apache.avro/avro

if anyone had time and interest to see whether we could possibly upgrade past 1.8.x for Jackson 2.16.

/cc @fmiguelez @baharclerode

pjfanning commented 1 year ago

There is a CVE in avro up to 1.11.3 - see

https://lists.apache.org/thread/wcj1747hvyl7qjhrfr6d6j1l62hvpr5l https://www.cve.org/CVERecord?id=CVE-2023-39410 https://issues.apache.org/jira/browse/AVRO-3819

401 shows the test issues.

apupier commented 10 months ago

Same number of test error with Avro 1.9.2 and 1.11.3: Tests run: 1442, Failures: 8, Errors: 14, Skipped: 50

apupier commented 10 months ago

The test com.fasterxml.jackson.dataformat.avro.SerializeGeneratedTest.testWriteGeneratedEvent() is failing with:

SerializeGeneratedTest.testWriteGeneratedEvent:18 » JsonMapping No field named 'specificData' (through reference chain: com.fasterxml.jackson.dataformat.avro.gen.Event35["specificData"])

I noticed one related change in Avro API which is related. The getSpecificData has been added to the API of SpecificRecordBase in 1.9.0 https://avro.apache.org/docs/1.9.0/api/java/org/apache/avro/specific/SpecificRecordBase.html#getSpecificData-- https://avro.apache.org/docs/1.8.0/api/java/org/apache/avro/specific/SpecificRecordBase.html

Consequently, the specificData field mentioned in the test error is added in com.fasterxml.jackson.databind.introspect.AnnotatedMethodCollector.collect(TypeFactory, TypeResolutionContext, JavaType, List, Class<?>)

bgalek commented 3 months ago

@apupier I've stumbled on the same problem with specificData any ideas for better workaround?

avroMapper.registerModule(new SimpleModule() {
            @Override
            public void setupModule(SetupContext context) {
                context.addBeanSerializerModifier(new BeanSerializerModifier() {
                    @Override
                    public List<BeanPropertyWriter> changeProperties(
                        SerializationConfig config,
                        BeanDescription beanDesc,
                        List<BeanPropertyWriter> beanProperties
                    ) {
                        for (int i = 0; i < beanProperties.size(); i++) {
                            BeanPropertyWriter writer = beanProperties.get(i);
                            if (writer.getName().equals("specificData")) {
                                beanProperties.remove(writer);
                            }
                        }
                        return super.changeProperties(config, beanDesc, beanProperties);
                    }
                });
                super.setupModule(context);
            }
        })
pjfanning commented 3 months ago

@bgalek Your change helps a bit but ultimately only fixes 1 of 21 broken tests in this Avro module. See #508

bgalek commented 3 months ago

@pjfanning sure, do you know any ETA for the overall fix? Can I somehow help?

pjfanning commented 3 months ago

No ETA - in fact, it seems unlikely anything will be done here. It has been open for a long time. 21 broken tests with many root causes. Would you not consider using Avro directly. It has its own capabilities for serializing and deserializing data for Java classes.

bgalek commented 3 months ago

Yeah, I think it's ok. I think we could close this issue with a bit explanation and your suggestion to use avro directly ;)

cowtowncoder commented 3 months ago

I am not sure I'd recommend "use something else" as the blanket answer. But it is true that right now there is no one working on resolving this issue. We would be in better position to get upgrade to Jackson 3.0 however, for what that is worth. But the compatibility issue still needs to be resolved.

Thanks to your patch (via @pjfanning 's PR) there is at least one less unit test failure with 2.18. Small step but still... :)

Put another way: solution here would be highly valuable but Help Needed. I can help with small details in getting things reviewed, merged and so on.

rafalh commented 3 months ago

I've made a PR (#511) that successfully passes all tests with Avro 1.11.3, but I'm unsure about backward compatibility. If I understand correctly there is no problem with over-the-wire (binary) compatibility, but the problem exists if someone generates a schema in the new Jackson and then gives it to a consumer using an old Jackson version? Is it still a relevant issue in 2024? Let me know what's the best way to handle it.

cowtowncoder commented 2 months ago

As per my comment on PR, what I think matters most is the over-the-wire compatibility. I don't think generation with newer version, trying to use with older would be big concern. It'd be great if there was a way to do some sanity checking wrt tests (like Jackson 2.17.2/avro communicating with modified new version) but not sure how that could be done with moderate effort.

I also think we would probably want PR against 2.18 branch instead of master (needs to be merged there, but that's for 3.0.0 and 2.18.0 will be released much sooner).