FasterXML / jackson-dataformat-xml

Extension for Jackson JSON processor that adds support for serializing POJOs as XML (and deserializing from XML) as an alternative to JSON
Apache License 2.0
561 stars 221 forks source link

2.14 breaking change: parsing xml string as JsonNode/Object used to return just a String, now returns an Object/Map #587

Closed jeronimonunes closed 5 months ago

jeronimonunes commented 1 year ago

Breaking change in my application when upgrading from 2.13.5 to 2.14

xmlMapper.readValue(xmlStreamReader, JsonNode.class)

with string

<tag>13,2</tag>

used to return "13,2". Now it returns a map {"": "13,2"}

Same thing happens if I use xmlMapper.readValue(xmlStreamReader, Object.class)

cowtowncoder commented 1 year ago

I would need actual reproduction: string "13,2" is not valid XML so I'd need to know actual input in question.

I don't doubt there may be a change (this is likely for improved handling of JsonNode as target, to allow handling of duplicate elements), just need to know what exactly is being attempte.

jeronimonunes commented 1 year ago
    @Test
    void test() throws XMLStreamException, IOException {
        var factory = new WstxInputFactory();
        var stringReader = new StringReader("<d>13,2</d>");
        var xmlStreamReader = factory.createXMLStreamReader(stringReader);
        xmlStreamReader.nextTag();
        var jsonNode = xmlMapper.readValue(xmlStreamReader, JsonNode.class);
        assertEquals(TextNode.class,jsonNode.getClass());
    }
Expected :class com.fasterxml.jackson.databind.node.TextNode
Actual   :class com.fasterxml.jackson.databind.node.ObjectNode
jeronimonunes commented 1 year ago

Sorry, I should have written the whole xml in the description. About the example above, same thing happens if I try to pass Object.class to readValue. It used to return a String, now it returns a Map.

cowtowncoder commented 1 year ago

Hmmh. Ok, I wrote first answer, saying this is expected... but then started thinking description was not correct.

What I do think is happening is that the change is due to improved handling of root level POJOs; and related to ambiguity of java.lang.Object as target; for which both Map and String are legal... and where with XML there is bit of natural ambiguity.

I suspect that the new behavior is what we will be going with, for what that is worth.

But I think that target type of String.class would give you the old behavior.

jeronimonunes commented 1 year ago

It does. I was first decoding everything to JsonNode and later using treeToValue to convert to other types.

Maybe Jackson can see that "" -> "ABC" is not a legal map because it only have "" as key and then return only the value?

jeronimonunes commented 1 year ago

I'm handling huge xmls and was keeping a temporary JsonNode before converting to the final type so I could log the content of that tag if the conversation failed. Do you know another temporary object I can use to store a single Xml tag for later conversion?

cowtowncoder commented 1 year ago

Empty String ("") is a valid JSON key.

Unfortunately buffering by Jackson uses TokenBuffer, which retains only converted JsonTokens, not the underlying XML contents.

I would have recommended trying JsonNode, but it probably has similar behavior. I cannot think of anything that would safely buffer everything without transformation, unfortunately.

It may be necessary to do buffering at lower level (Reader or InputStream used), and use that for diagnostics on failure: JsonFactory (or XmlFactory subtype) has method

public JsonFactory setInputDecorator(InputDecorator d)

that may make this approach simpler (although if you control caller you can just wrap Reader / InputStream on your own).

patryk-wojcik-db commented 5 months ago

I can confirm that updating from 2.13.3 to 2.14.x or higher introduces breaking change.

    @Test //This test is going to fail if com.fasterxml.jackson.dataformat.jackson-dataformat-xml is updated to 2.14.x or higher due to breaking change introduced in jackson-dataformat-xml 2.14
    void testTextNode() throws IOException {
        String exampleXml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<text>Hello World</text>\n</xml>";

        XMLInputFactory input = new WstxInputFactory();
        input.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, Boolean.FALSE);

        JsonNode rootNode = new XmlMapper(new XmlFactory(input, new WstxOutputFactory())).readTree(
                exampleXml.getBytes());

        Assertions.assertAll(
                () -> Assertions.assertTrue(rootNode.isTextual()),
                () -> Assertions.assertEquals("Hello World", rootNode.textValue()));
    }

Above test passes for jackson-dataformat-xml 2.13.3: image

and fails for any version above that due to difference in node handling: image

cowtowncoder commented 5 months ago

New behavior -- binding as ObjectNode (for target type of JsonNode) and Map (target type of java.lang.Object) -- is considered correct and expected behavior for Jackson. Will close the issue.

That said, I would be open to idea of having configuration option to allow alternate handling, if anyone wants to try creating a PR (if so, it'd be FromXmlParser.Feature)