FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0
310 stars 133 forks source link

Update avro to 1.11.3 #511

Closed rafalh closed 1 month ago

rafalh commented 1 month ago

Update Apache avro library to 1.11.3. To make it compatible namespace generation for nested classes had to be changed. Now dots are used instead of dollar signs as indicator of nesting. AFAIK dollar sign is not an allowed character in Avro namespace according to the specification and this is why this change was implemented in avro library. Note that avro library generates namespaces not ending with dollar character since version 1.9, but it actually removes all dollar signs in classes with multiple levels of nesting since 1.11.

See: AVRO-2143 and AVRO-2757 Fixes #167

pjfanning commented 1 month ago

@rafalh Thanks - I ran the unit tests with your changes and the tests all pass.

@cowtowncoder the changes look small enough and if users want to use the older Avro releases supported by jackson-dataformat-avro 2.17 - they can just not upgrade.

If you run the tests with the avro version set back 1.8.2 but with the other code changes in this PR - you get Failures: 19, Errors: 97.

All tests pass if you use avro 1.9.2.

pjfanning commented 1 month ago
cowtowncoder commented 1 month ago

Ok so this would only be for Jackson 3.0 (master branch)? Is this intentional? I can see arguments both ways, but Jackson 3.0 release is still out somewhat -- while I still hope for before-end-of-2024 RC1 that's not guaranteed. But I am trying to finalize 2.18 within this (or possibly next week).

Depending on compatibility constraints -- I am not 100% sure I understand constraints wrt Avro schema handling. The important thing would be ability for endpoints where one uses older Jackson Avro module and another newer would still work correctly wrt wire format. Ideally also upgrade to newer version would work without hitch (binary and source compatible) but the wire format compatibility is what matters most.

rafalh commented 1 month ago

Ok so this would only be for Jackson 3.0 (master branch)? Is this intentional?

I wasn't sure what the target branch of the PR should be. I assumed that if the change is evaluated to be safe for back-porting, it will get cherry-picked to other branches, but if that's not your workflow I can switch to 2.18. I wasn't sure what will be the reception considering that some people was worried about schema compatibility.

Depending on compatibility constraints -- I am not 100% sure I understand constraints wrt Avro schema handling. The important thing would be ability for endpoints where one uses older Jackson Avro module and another newer would still work correctly wrt wire format. Ideally also upgrade to newer version would work without hitch (binary and source compatible) but the wire format compatibility is what matters most.

If I understand correctly over-the-wire compatibility should be preserved. Also working with external schema which uses namespace with dollar signs should work fine, because I didn't remove handling for it: https://github.com/FasterXML/jackson-dataformats-binary/pull/511/files#diff-6199d0c495f8d8ed14a8059be2c4607ca408f8a4733390bb78350fec5d8010a9R341

The only problem is when someone generates schema from POJO with nested classes in new Jackson and then tries to use that schema with old Jackson. I don't think it's a popular use-case. Schema producer can always stick to older version if needed. If this is acceptable I am all for merging it to 2.18. Alternative solution would be to add a new feature to AvroGenerator.Feature enum that would allow to generate backward-compatible schemas, e.g. DOLLAR_SIGN_IN_NAMESPACES. Let me know if you want this feature added.

pjfanning commented 1 month ago

@rafalh the package names in master branch were changed from the values used in the 2.18 branch. Could you do a new PR that implements this set of changes but target the 2.18 branch instead? Leave this PR open too because it is useful for merging to the master branch.

rafalh commented 1 month ago

@rafalh the package names in master branch were changed from the values used in the 2.18 branch. Could you do a new PR that implements this set of changes but target the 2.18 branch instead?

Done: #512

cowtowncoder commented 1 month ago

Yes, like @pjfanning mentioned, merging across 2.x and master won't work with cherry-picking (due to multiple refactorings, renamings mostly) and so almost always starting with 2.x then resolving renames, package imports when merging to master is the way to go. And in general starting with oldest branch to support, rolling forward (although cherry-pick does work usually across 2.x maintenance branches).

Thank you for creating 2.18 PR -- I hope to get that merged soon.

cowtowncoder commented 1 month ago

Was able to merge forward #512, closing this one.