FasterXML / jackson-dataformat-xml

Extension for Jackson JSON processor that adds support for serializing POJOs as XML (and deserializing from XML) as an alternative to JSON
Apache License 2.0

`XmlMapper` output not well-formed when Object keys use invalid XML name characters #511

Open rlbns opened 2 years ago

rlbns commented 2 years ago

I've been working with a lot of web APIs and converting the JSON to XML using XmlMapper. It generally works great, but I have an example that creates invalid XML. I am using the latest version of Jackson (2.13.1).

When you call this API you get the attached JSON output. https://world.openfoodfacts.org/api/v0/product/7622300315733.json

The Jackson pretty-printer has no problem indenting this nicely. BadXmlFromJson.zip

After converting it to XML (also in the attached ZIP) you get some issues that prevent Xerces or Saxon from parsing it:

  1. Namespace prefixes are not declared
  2. Some elements are not well formed

For example, at line 48 you see this: <agribalyse_proxy_food_code:en>12315</agribalyse_proxy_food_code:en>

and at line 653 you see this: <1> which is an error because XML names must start with a letter or underscore.
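Both failures can be reproduced without Jackson. The following is a minimal sketch, assuming the namespace-aware parser bundled with the JDK behaves like Xerces here; the class name `WellFormedCheck`, the helper `parses`, and the tiny `<root>` documents are made up for illustration and are not taken from the attached ZIP.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: not part of Jackson, only uses the JDK's built-in XML parser.
class WellFormedCheck {

    /** Returns true if a namespace-aware parser accepts the given XML text. */
    static boolean parses(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required to detect undeclared prefixes
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Throw on errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed, or not namespace-well-formed
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Plain name: accepted.
        System.out.println(parses("<root><food_code_en>12315</food_code_en></root>"));
        // Colon is read as a namespace prefix that was never declared: rejected.
        System.out.println(parses(
            "<root><agribalyse_proxy_food_code:en>12315</agribalyse_proxy_food_code:en></root>"));
        // Name starts with a digit: rejected.
        System.out.println(parses("<root><1>x</1></root>"));
    }
}
```

Note that the colon case is only rejected because the parser is namespace-aware; a parser with namespace processing turned off would treat the colon as an ordinary name character.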

I'm using XmlMapper like this:

        XmlMapper xmlMapper = new XmlMapper();
        return xmlMapper.writeValueAsString(jsonTree);

I can't find any features in FromXmlParser.Feature or ToXmlGenerator.Feature that apply to either of these issues.

Is there another way to configure XmlMapper or is this a bug?

cowtowncoder commented 1 year ago

It'd be nice to have the test inlined here instead of in a zip archive. That aside, yes, there is a problem when Map keys or POJO property names contain characters that are not valid in XML names. Names are written as-is: XML has no mechanism for escaping name characters, so an element or attribute name cannot safely carry such characters.

However, #531 (added in 2.14.0) adds a mechanism that might work: it allows names to be translated using a convention, which avoids this problem.
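To illustrate what translating names by convention can look like, here is a minimal sketch of one possible convention: replace every character that is not valid in a name with an underscore, and prefix an underscore when the first character cannot start a name. This is not Jackson's implementation and does not use its API; the class `XmlNameSanitizer` and its methods are hypothetical, and the character rules are a simplified ASCII-only subset of the XML name grammar (the colon is deliberately treated as invalid so it is never read as a namespace prefix).

```java
// Hypothetical helper showing one replacement convention; not part of Jackson.
class XmlNameSanitizer {

    // Simplified, ASCII-only rule: a name may start with a letter or underscore.
    static boolean isNameStart(char c) {
        return c == '_' || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    // Simplified, ASCII-only rule: later characters may also be digits, '-' or '.'.
    static boolean isNameChar(char c) {
        return isNameStart(c) || (c >= '0' && c <= '9') || c == '-' || c == '.';
    }

    /** Maps an arbitrary JSON key to a string usable as an XML element name. */
    static String sanitize(String key) {
        if (key.isEmpty()) {
            return "_";
        }
        StringBuilder sb = new StringBuilder(key.length() + 1);
        if (!isNameStart(key.charAt(0))) {
            sb.append('_'); // e.g. a leading digit
        }
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            sb.append(isNameChar(c) ? c : '_');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sanitize("1"));                             // _1
        System.out.println(sanitize("agribalyse_proxy_food_code:en")); // agribalyse_proxy_food_code_en
        System.out.println(sanitize("nutrition-score"));               // nutrition-score
    }
}
```

A replacement convention like this is lossy: `a:b` and `a_b` map to the same name, and the original key cannot be recovered when reading the XML back. A reversible convention, such as encoding the whole key, avoids that at the cost of readability.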