FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0
316 stars 136 forks source link

[Avro] Add support for @AvroEncode annotation #69

Closed baharclerode closed 7 years ago

baharclerode commented 7 years ago

This enables support for the @AvroEncode annotation, which is the avro equivalent of a JsonSerializer/JsonDeserializer.

The serialization side of things was fairly straight-forward, since we can write directly to the underlying datum without worrying about parser state; Deserialization is a bit more tricky, since the parser inserts virtual tokens (start/end object, field names on records, etc.) that we need to consume those and otherwise ensure the parser state is updated so that deserialization control can be handed back to Jackson once the CustomEncoding has done its job.

This is the last major piece of the puzzle that brings compatibility with the apache implementation up to an acceptable level for my use cases, so hoping it'll make the cut for 2.9 😃


I also re-enabled the tests that were ignored in 2.8 due to namespace changes, since the underlying code changes were themselves reverted when merged into master a while back.

cowtowncoder commented 7 years ago

Ok hmmh. This is quite elaborate, and I'll have to think carefully... since the problem is that I would ideally want to reduce or eventually even eliminate dependencies to the standard avro lib, wrt encoding/decoding. If this is only sort of isolated area and functionality only affects annotated types (which I think is the case), it may still be ok. And I am +1 on increased interoperability per se. I just don't want to add coupling to implementation; for example, to improve serialization speed I would eventually want to implement encoding directly.

I'll have to read over the code a bit more; I like many of the aspects (and we can probalby simplify some parts as per comments I added).

baharclerode commented 7 years ago

You won't be able to completely remove the encoding dependencies from the avro lib and support this annotation - The annotation specifically requires a subtype of CustomEncoding, which references both the Decoder and Encoder abstract classes; They are, however, effectively interfaces with a few default methods, which means that it will still be completely decoupled from the implementation. (CustomEncoding also references Schema, which can't be easily replaced though)

For example, the deserialization should already be decoupled, since that is handled by a wrapper that extends Decoder and wraps the AvroParserImpl using the existing methods to ensure that the parser state is advanced correctly. I don't think it even calls any of the decoder primitives that AvroParserImpl exposes.

Serialization would probably require a similar pattern, but as long as the underlying implementation supports the same Encoder primitives, the NonBSGenericDatumWriter can be swapped out for any other (faster) implementation that knows how to pass itself or a wrapper into EncodedDatum.write() method.

cowtowncoder commented 7 years ago

Agreed: I am ok with using annotations as well as simple interfaces (Decoder, Encoder) for this part. I am only worried about keeping those isolated, so that not everything has to implement these abstractions. And if at some future point it seemed useful we could further modularize things to isolate even these aspects -- but that might never happen.

But note that my explicit goal (hoping to still get done for 2.9) is to remove actual usage of BinaryDecoder for standard decoding. I don't think this would interfere with that goal, I just want to mention it just in case I am overlooking something.

baharclerode commented 7 years ago

Your fix for _isExplicitClassOrOb worked perfectly, and I've updated the PR to remove the AvroHandlerInstantiator.

cowtowncoder commented 7 years ago

@baharclerode Looks like there was yet another smaller issue, now fixed too:

https://github.com/FasterXML/jackson-databind/issues/1584

but I don't think it'd affect your changes here.