FasterXML / jackson-dataformats-binary

Uber-project for standard Jackson binary format backends: avro, cbor, ion, protobuf, smile
Apache License 2.0
314 stars 136 forks source link

IonObjectMapper close()s the provided IonWriter unnecessarily #189

Closed zslayton closed 4 years ago

zslayton commented 4 years ago

Description

It is not currently possible to re-use an IonWriter to serialize several values across multiple calls to IonObjectMapper#writeValue(IonWriter, Object). I believe this is unintentional, as the documentation for that method states that the method will not close the underlying writer.

Impact

The current behavior has a few downsides:

Cause

The ObjectMapper class's implementation of _configAndWriteValue calls close() on the JsonGenerator that's used to serialize the provided value.

Ion's implementation of JsonGenerator is IonGenerator, and its implementation of close() attempts to flush the IonWriter by calling close() on it.

Suggested Fix

I believe that IonGenerator#close should only call _writer.close() if _ioContext.isResourceManaged() is true. Otherwise, the user provided a reference to the IonWriter being used and should have the authority to flush()/finish()/close() it at their own discretion.

Please let me know if you agree with this assessment. I'd be happy to put together a PR with this change. Thanks!

zslayton commented 4 years ago

I started writing a PR for this and realized that the fix is slightly more complex than I thought. _ioContext.isResourceManaged() is false for both of these code paths:

  1. IonObjectMapper#writeValue(OutputStream, Object)
  2. IonObjectMapper#writeValue(IonWriter, Object)

However, we need IonGenerator#close to call IonWriter#close for the OutputStream flavor (because it creates a new IonWriter) and we need it to not call IonWriter#close for the IonWriter flavor (because the user might wish to continue using the same IonWriter.)

I've created a patch that adds a new member field to IonGenerator: boolean _ionWriterIsManaged. It tracks whether the IonWriter being used is a one-shot instance that should be close()d when the IonGenerator is closed.

I'd like to bump the ion-java dependency to the latest available version too. Can I do that in the same commit, or would you like a separate PR?

cowtowncoder commented 4 years ago

Maybe do separate ones, although not a big deal if they were combined. Also to consider is the branch to base PR off: I think you are right in that "is managed" state missing is a bug so it could go in 2.10, but at the same time I am hoping to have aggressive timeline for 2.11 so that might be better target. Still, adoption of Ion backend is relatively slow so I am not against going with 2.10.

Plan itself sounds right, and the only thing I would add it that for many uses it might be better to use (or implement, if support missing) ObjectWriter.writeValues() mechanism (returns SequenceWriter) that is designed to keep underlying writer open unless SequenceWriter itself is closed. Still, since there is existing override for IonWriter that should work the way it is documented.

zslayton commented 4 years ago

Plan itself sounds right, and the only thing I would add it that for many uses it might be better to use (or implement, if support missing) ObjectWriter.writeValues() mechanism (returns SequenceWriter) that is designed to keep underlying writer open unless SequenceWriter itself is closed.

Ah, interesting! I wasn't aware of the SequenceWriter API, thank you for pointing it out. It looks like the SequenceWriter already works, but it doesn't expose any way to call finish() on the underlying IonWriter. You could write several values, but they would all be buffered in the IonWriter until you called SequenceWriter#close. I'll have to open another PR to create an IonSequenceWriter with that functionality.

I've opened PR #190 to prevent IonObjectMapper#writeValue(IonWriter, Object) from closing the provided IonWriter. The change targets version 2.10 and includes an ion-java version bump to 1.5.1, the latest stable release.

Assuming that it can be merged, how do you normally propagate this sort of change to later Jackson release branches? (2.11, 3.x) Will I need to open similar PRs for later versions, or do you have a process for porting change sets?

cowtowncoder commented 4 years ago

I will merge forward fixes; 2.11 should be easy and clean, but from there to master manual conflict resolution is often needed. So no need to file anything other than target branch.

Occasionally merge-to-master, cherry-pick back is also used. But usually easier to from oldest-to-newest.

Interesting point on SequenceWriter needing additional clean up on close. May need to think of how to make that pluggable, as sub-classing is not necessarily easy at this point just because whereas ObjectMapper is now considered sub-classable by format modules (that is, there now needs to be XxxMapper for all, esp. with 3.x), ObjectWriter is not format-specific.

cowtowncoder commented 4 years ago

Merged, thank you! I hope I did not mangle merge to master -- tests still pass but... just in case.

zslayton commented 4 years ago

Great, thanks!

I hope I did not mangle merge to master -- tests still pass but... just in case.

I took a look at the merge commits and didn't notice anything amiss. Should I check out the master branch and do some testing before I close the issue out?

cowtowncoder commented 4 years ago

@zslayton either way is fine; if it's easy to test with master that'd be nice but not really necessary. Although eventually it would definitely be great to have someone who uses Ion module to verify functioning beyond unit tests -- test suite is bit limited at this point. Of course improvements to the test suite would be extremely valuable too.

zslayton commented 4 years ago

I'm going to go ahead and close this out. If we run into any issues with the merge I'll open another PR to address it. Thanks!