FasterXML / jackson-dataformat-xml

Extension for Jackson JSON processor that adds support for serializing POJOs as XML (and deserializing from XML) as an alternative to JSON

Prefixed element is not supported #515

Closed mincong-h closed 2 years ago

mincong-h commented 2 years ago

Currently, if two elements share the same local name, one with a prefix and the other without, the Jackson XML mapper cannot read them correctly. For example, in the following structure, the mapper fails to tell "content" and "media:content" apart:

<post>
  <content>&lt;p&gt;Hello world!&lt;/p&gt;</content>
  <media:content medium="image" url="http://example.com" xmlns:media="http://search.yahoo.com/mrss/" />
</post>

I discovered this when trying to read the RSS feed of my blog (https://mincong.io/feed.xml). The XML structure is generated by the Jekyll Feed plugin, which emits both content and media:content, so I would rather not modify it if possible. I am using Jackson XML version 2.13.2.

Here is a test suite that reproduces the problem. The second test fails because content is deserialized as an empty string:

import com.fasterxml.jackson.dataformat.xml.XmlMapper;
import com.fasterxml.jackson.dataformat.xml.annotation.JacksonXmlProperty;
import com.fasterxml.jackson.dataformat.xml.annotation.JacksonXmlRootElement;
import org.junit.jupiter.api.Test;

import static org.assertj.core.api.Assertions.assertThat;

public class PrefixedElementTest {

    @JacksonXmlRootElement(localName = "post")
    static class MyPost {
        @JacksonXmlProperty(localName = "content")
        String content;
    }

    @Test
    void withoutMediaContent() throws Exception {
        var mapper = new XmlMapper();
        var post = mapper.readValue("""
        <post>
            <content>&lt;p&gt;Hello world!&lt;/p&gt;</content>
        </post>
        """, MyPost.class);
        assertThat(post.content).isEqualTo("<p>Hello world!</p>");
    }

    @Test
    void withMediaContent() throws Exception {
        var mapper = new XmlMapper();
        var post = mapper.readValue("""
        <post>
            <content>&lt;p&gt;Hello world!&lt;/p&gt;</content>
            <media:content medium="image" url="http://example.com" xmlns:media="http://search.yahoo.com/mrss/" />
        </post>
        """, MyPost.class);

        // org.opentest4j.AssertionFailedError: 
        // Expecting:
        // <"">
        // to be equal to:
        // <"<p>Hello world!</p>">
        // but was not.
        assertThat(post.content).isEqualTo("<p>Hello world!</p>");
    }
}

cowtowncoder commented 2 years ago

Correct: the Jackson XML module does not support elements or attributes that differ only by namespace (prefixes map to namespaces in XML). This is a current limitation and there are no plans to change the behavior. Theoretically it could be fixed, of course, since Jackson does keep track of declared namespaces, but only the local name is currently used for matching properties.

There are existing issues for this, for example #27, #65 and #192, so I will close this as a duplicate.

mincong-h commented 2 years ago

Thanks for your quick reply, @cowtowncoder, I really appreciate it. Your explanation is very clear. 👍 By the way, I ended up using jsoup as a replacement and relying on CSS selectors to find the right tag (content or media:content).
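
For anyone who would rather stay within the JDK, here is a minimal sketch of a different workaround (not the jsoup approach mentioned above): a namespace-aware DOM parse that matches on both namespace URI and local name, which is the distinction the explanation above says Jackson does not currently make. The class name NamespaceAwarePostReader and its helper methods are hypothetical names chosen for illustration; only the javax.xml and org.w3c.dom calls come from the JDK.

```java
import java.io.StringReader;
import java.util.Objects;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of Jackson or jsoup): reads the sample <post>
// document and tells <content> apart from <media:content> by namespace URI.
final class NamespaceAwarePostReader {

    // Namespace bound to the "media" prefix in the sample document.
    static final String MEDIA_NS = "http://search.yahoo.com/mrss/";

    // Parses the XML with namespace awareness enabled and returns the root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default; required for getLocalName()
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns the first direct child matching BOTH namespace URI and local name,
    // or null if there is none. A null namespaceUri means "no namespace".
    static Element firstChild(Element parent, String namespaceUri, String localName) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node instanceof Element
                    && localName.equals(node.getLocalName())
                    && Objects.equals(namespaceUri, node.getNamespaceURI())) {
                return (Element) node;
            }
        }
        return null;
    }

    // Text of the unprefixed <content> element, or null if absent.
    static String postContent(String xml) {
        Element content = firstChild(parseRoot(xml), null, "content");
        return content == null ? null : content.getTextContent();
    }

    // The url attribute of <media:content>, or null if the element is absent.
    static String mediaUrl(String xml) {
        Element media = firstChild(parseRoot(xml), MEDIA_NS, "content");
        return media == null ? null : media.getAttribute("url");
    }

    public static void main(String[] args) {
        String xml = "<post>"
                + "<content>&lt;p&gt;Hello world!&lt;/p&gt;</content>"
                + "<media:content medium=\"image\" url=\"http://example.com\""
                + " xmlns:media=\"http://search.yahoo.com/mrss/\" />"
                + "</post>";
        System.out.println(postContent(xml));
        System.out.println(mediaUrl(xml));
    }
}
```

Matching on the namespace URI rather than the literal "media" prefix keeps the lookup working even if a feed binds the same namespace to a different prefix.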

cowtowncoder commented 2 years ago

@mincong-h thank you, and apologies that you had to resort to a workaround. Ideally it really should "just work". But I am glad you were able to find one.