FasterXML / jackson-dataformat-xml

Extension for Jackson JSON processor that adds support for serializing POJOs as XML (and deserializing from XML) as an alternative to JSON
Apache License 2.0
561 stars 221 forks source link

[DRAFT] Support Encoding option when writing XML #593

Open pjfanning opened 1 year ago

pjfanning commented 1 year ago
pjfanning commented 1 year ago

@cowtowncoder when you get a chance, could you review the work in progress? If the new XmlMapper methods look ok, I can extend the coverage on them.

cowtowncoder commented 1 year ago

Makes sense for the most part. The only thing I am not sure how I feel about is addition of more writeValue() methods in ObjectMapper: I understand convenience aspect but this introduces discrepancy between ObjectWriter and ObjectMapper. There's also the problem of combinatorial expansion if we wanted to cover all permutations. So in some ways I wonder if we'd require construction of ToXmlGenerator, then using writeValue() passing that: not as convenient, but incremental improvement to first allow doing this, then seeing how to make it more convenient.

I guess if we were to consider extending support for encodings to be core part of jackson-databind, that could be solved better, but obviously bigger undertaking.

So maybe adding methods here only first makes sense.

pjfanning commented 1 year ago

I'm not sure about the idea to force users to create a ToXmlGenerator. I understand that this avoids expanding the number of methods but it just doesn't feel like the natural way for users to use the API. In practice, most Jackson users don't know about JsonGenerators and the various subclasses like ToXmlGenerator.

One alternative approach would be to add a setter on XmlMapper or XmlFactory where users can set a default charset.

The problem here is that if users need to create XML outputs with different charsets that they would need to create an XmlMapper instance for every charset that they need to support. We could have this setting on XmlMapper and allow it to be changed just before any 'write' call. This goes against the aim of making Mapper instances immutable but it might be the tidiest approach to allowing multiple charsets without having to add quite a few new 'write' methods.

If we keep the current approach in this PR of adding new write methods, we could also add them to a new XmlWriter subclass of ObjectWriter.

cowtowncoder commented 1 year ago

I'm not sure about the idea to force users to create a ToXmlGenerator. I understand that this avoids expanding the number of methods but it just doesn't feel like the natural way for users to use the API. In practice, most Jackson users don't know about JsonGenerators and the various subclasses like ToXmlGenerator.

Yes, that is problematic as well.

One alternative approach would be to add a setter on XmlMapper or XmlFactory where users can set a default charset.

That does not seem right either, although of the two, XmlFactory should probably have it if anything.

Could use Builder(); but could not set it on per-call basis.

The problem here is that if users need to create XML outputs with different charsets that they would need to create an XmlMapper instance for every charset that they need to support. We could have this setting on XmlMapper and allow it to be changed just before any 'write' call. This goes against the aim of making Mapper instances immutable but it might be the tidiest approach to allowing multiple charsets without having to add quite a few new 'write' methods.

If we keep the current approach in this PR of adding new write methods, we could also add them to a new XmlWriter subclass of ObjectWriter.

Unfortunately when I tried to make ObjectWriter (and ObjectReader) extensible, I quickly found out that just won't work wrt chaining. This was quite a while ago (5+ years?) but I don't think things have changed.

I need to think about this some more. I am not dead set against new methods in XmlMapper but still trying to think of something better. This might be why I was originally thinking of a handler, callback style approach; configurable on XmlMapper or XmlFactory, but more flexible. Not sure that'd be good either, but might be why I had the idea.

pjfanning commented 1 year ago

@cowtowncoder I added an XmlWriter to this PR - an extension to ObjectWriter but with the additional write methods that I added to XmlMapper. The methods on XmlMapper that return an ObjectWriter will return an XmlMapper that can be cast to get access to the extra methods. I also added a toXmlWriter(ObjectWriter) static method to XmlMapper.

If these API changes are ok, I can add extra test coverage to this PR.

pjfanning commented 11 months ago

@cowtowncoder is this PR worth continuing with?

ifindya commented 11 months ago

Good draft, really needed.

cowtowncoder commented 8 months ago

Ok, so:

  1. I really want to support ability to specify encoding and other XML prolog settings
  2. I do not want ObjectWriter (or ObjectReader) to be sub-classed (I played with this idea at some point but... it doesn't really compose nicely, unfortunately, from what I remember)

given this I don't think this PR as-is will be worth pursuing. But I think it's useful to keep open until an alternative path is found.