FasterXML / jackson-dataformat-xml

Extension for Jackson JSON processor that adds support for serializing POJOs as XML (and deserializing from XML) as an alternative to JSON
Apache License 2.0

Deserialization of `null` String values in Arrays / `Collection`s not working as expected #584

Closed mbladel closed 1 year ago

mbladel commented 1 year ago

Environment: Jackson Dataformat XML 2.14.2

We are trying to serialize and deserialize arrays or collections of Strings containing both empty and null values. We noticed that nulls are handled correctly when serializing:

When parsing these values back, the null elements have become empty ("") values. We tried:

Testing showed that with simple String attributes inside a class both null and empty values are correctly handled during deserialization, as expected (see also https://github.com/FasterXML/jackson-dataformat-xml/issues/354).

The behavior of String values inside collections should be aligned with how String attributes inside objects are handled.

Here's an example:

```java
XmlMapper xmlMapper = new XmlMapper();
xmlMapper.enable( ToXmlGenerator.Feature.WRITE_NULLS_AS_XSI_NIL );
xmlMapper.enable( FromXmlParser.Feature.PROCESS_XSI_NIL );

String string = xmlMapper.writeValueAsString( new String[] { "", "test", null, "test2" } );
// string: <Strings><item></item><item>test</item><item xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:nil="true"/><item>test2</item></Strings>
String[] parsed = xmlMapper.readValue( string, String[].class );
// parsed: ["", "test", "", "test2"]
```

cowtowncoder commented 1 year ago

Quick note(s):

`<item />` and `<item></item>` are semantically equivalent in XML. Although (some) parsers can indicate the difference (Woodstox does), and most can produce one or the other explicitly, it's not a good idea to try to use the difference for semantics, due to interoperability concerns.

Having said that, yes, handling of String values in Arrays or typed Collections should work the same way as for individual properties. Handling the difference between null and empty String is tricky, and I think the Array/Collection deserializers probably have some shortcuts that would need to be disabled for XML.
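
The equivalence of `<item />` and `<item></item>` mentioned above can be checked without Jackson at all. The following is a minimal sketch using only the JDK's built-in DOM parser; the class name `EmptyElementDemo` and its helper methods are hypothetical names chosen for illustration, and nothing here reflects how Jackson or Woodstox work internally.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class (not part of Jackson): shows that a standard DOM
// parser yields the same tree for both spellings of an empty element.
final class EmptyElementDemo {

    // Parses a small XML document and returns its root element.
    private static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    // Text content of the root element; "" when the element has no children.
    static String rootText(String xml) {
        return parseRoot(xml).getTextContent();
    }

    // Whether the root element has any child nodes (text or elements).
    static boolean rootHasChildren(String xml) {
        return parseRoot(xml).hasChildNodes();
    }

    public static void main(String[] args) {
        for (String xml : new String[] { "<item />", "<item></item>" }) {
            System.out.println(xml + " -> text=\"" + rootText(xml)
                    + "\", hasChildren=" + rootHasChildren(xml));
        }
    }
}
```

Both inputs produce an element with empty text content and no child nodes, so code working at this level has nothing to base a null-versus-empty decision on.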

mbladel commented 1 year ago

Thanks for the feedback.

> `<item />` and `<item></item>` are semantically equivalent in XML

I agree that's working as expected; that is why I went for the `xsi:nil` approach in the example. I just wanted to exhaust all possibilities with the different configurations/features and let you know what I found from my testing.
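
To illustrate why `xsi:nil` is a more robust signal than the empty-element spelling, here is a minimal sketch that reads the XML from the example with the JDK's namespace-aware DOM parser and maps `xsi:nil="true"` to `null`. The class `XsiNilDemo` and the method `readItems` are hypothetical names for this illustration; this is not Jackson's implementation. It also assumes only the literal value `"true"` marks a nil element, which is a simplification (XML Schema also allows `"1"`).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class (not part of Jackson): distinguishes null from ""
// by looking at the xsi:nil attribute rather than at the element's spelling.
final class XsiNilDemo {

    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    // Returns one entry per child element of the root: null if the element
    // carries xsi:nil="true", otherwise its text content (possibly "").
    static List<String> readItems(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed so getAttributeNS can resolve xsi:nil
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();

            List<String> values = new ArrayList<>();
            NodeList children = root.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace or other non-element nodes
                }
                Element item = (Element) child;
                // Simplification: only the literal "true" is treated as nil.
                boolean nil = "true".equals(item.getAttributeNS(XSI_NS, "nil"));
                values.add(nil ? null : item.getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Strings><item></item><item>test</item>"
                + "<item xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:nil=\"true\"/>"
                + "<item>test2</item></Strings>";
        System.out.println(readItems(xml)); // prints [, test, null, test2]
    }
}
```

The result keeps the empty string and the null as separate values, which is the round-trip behavior the issue asks for in arrays and collections.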

cowtowncoder commented 1 year ago

@mbladel Yes, good. I did realize it is/was not the main point. I mostly mentioned it since at some point I was trying to consider/expose the distinction and spent some time thinking about it.