FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0
310 stars 133 forks source link

Add support for generating IonSexps #242

Closed jhhladky closed 3 years ago

jhhladky commented 3 years ago

Add writeStartSexp and writeEndSexp methods to the IonGenerator. This requires extending the JsonWriteContext into a new class that recognizes the sexp context.

cowtowncoder commented 3 years ago

Ok, ideally I think this change would go in 2.13 since it changes types a bit... but I assume this is for a bug fix that'd be good to get out sooner. As usual I think I'd want @jobarr-amzn (or one of other developers with knowledge) to add their ok as well. About the only suggestion I have is that in addition to a fail case (in which it might make sense to check that exception message has message expected), it'd be good to have a passing case that exercises the new code path. Some existing tests probably do that already, but something explicit is nice to guard against regressions.

I will probably need some help merging this to master for 3.0: change in handling of read/write contexts will make handling easier (GeneratorBase does not create it), but requires some changes to types used. If so I'll add a note after merging.

cowtowncoder commented 3 years ago

@mcliedtke On JsonFormat.Shape: that is an interesting challenge -- Shapes are format-agnostic, and adding new entries is not necessarily easy. There is Shape.NATURAL which sometimes could work (for "intrinsic" representation of something). 3.0 adds POJO to support logical concept of key/value pairs, but that is probably not relevant.

mcliedtke commented 3 years ago

@mcliedtke On JsonFormat.Shape: that is an interesting challenge -- Shapes are format-agnostic, and adding new entries is not necessarily easy. There is Shape.NATURAL which sometimes could work (for "intrinsic" representation of something). 3.0 adds POJO to support logical concept of key/value pairs, but that is probably not relevant.

Yeah, this is maybe something peculiar about Ion that it has both list and sexp which are very similar ordered-list concepts. I don't think NATURAL or POJO quite capture the idea well.

jhhladky commented 3 years ago

Ok, ideally I think this change would go in 2.13 since it changes types a bit... but I assume this is for a bug fix that'd be good to get out sooner.

Thanks I appreciate the flexibility! It would be really great if we could get this into 2.12.

About the only suggestion I have is that in addition to a fail case (in which it might make sense to check that exception message has message expected), it'd be good to have a passing case that exercises the new code path. Some existing tests probably do that already, but something explicit is nice to guard against regressions.

Sure, I'll add a test against the message and add some more tests for IonSexps embedded in objects, other IonSexps, and as top-level values.

cowtowncoder commented 3 years ago

@mcliedtke right, NATURAL semantic imply "default" handling of some kind, and POJO property key/value aspect (and really have more to do with Java value side than output representation).

Format-specific formatting aspects is one part of configurability that is sort of missing from Jackson: while it is now possible to more easily have format-specific read/write settings, those are global and very simple (on/off). Conversely it is also quite easy to have per-mapper-format-specific settings of any kind (via ObjectMapper.Builder, sub-classable), but those are also immutable and global (that is, have to be set before use and cannot be change on per-read or per-write basis; nor scoped to specific property).

Another possible approach is to improve support to format-specific value objects (there is JsonGenerator.writeEmbeddedObject() for writing, and JsonToken.VALUE_EMBEDDED_OBJECT for reading as well as JsonParser.getEmbeddedObject()). This was mostly added to support exposing binary data (where it exists natively and not as base64-embedded String) but has been used for other types. Probably the most urgent need for extension in this context are "logical types" that CBOR and Avro expose; that is, things that are physically encoded as something (String, Object, Array) but have logical structure or value type specified by encoding or schema. The challenge is that of exposing and using such mappings at streaming API level, but also used by databinding (in a way replacing general-purpose databinding).

Above are just concepts that exist and might be usable; I don't know how exactly they would apply.

mcliedtke commented 3 years ago

@cowtowncoder Thanks for the background, I'll poke around and maybe create a issue for it.

Latest changes LGTM

jhhladky commented 3 years ago

Any blockers to merging this? Thanks!

cowtowncoder commented 3 years ago

Just my time: Jackson is just my avocation, not job. This change will likely not merge cleanly to master and I'll need to figure that out. This is on top of my todo list fwtw.

cowtowncoder commented 3 years ago

Was able to merge to master; also realized that pretty-printing does not really work as is and filed a new issue (#245). Will need help resolving that one, but at least this is merged.

jhhladky commented 3 years ago

Thanks for merging this! I really appreciate the prompt responses.

cowtowncoder commented 3 years ago

@JacekLach np! Do you think there are other changes coming soon -- I am thinking of possibly releasing 2.12.2 over the weekend, but wanted to see if there might be something else pending.

jhhladky commented 3 years ago

Nothing else from my end :)