pdvrieze / xmlutil

XML Serialization library for Kotlin
https://pdvrieze.github.io/xmlutil/
Apache License 2.0

Node fails to parse, and polymorphic Any type omits items #225

Open joejensen8 opened 4 months ago

joejensen8 commented 4 months ago

I'm trying to parse the XML of Android string resource files. Here's an example:

  <?xml version="1.0" encoding="utf-8"?>
  <resources xmlns:xliff="urn:oasis:names:tc:xliff:document:1.2">
      <string name="string_android">Test with argument <xliff:g id="argument">%1s</xliff:g> here</string>
      <string name="test_string_2">test 2</string>
  </resources>

NOTE - the content of each <string> tag is mixed: it can contain nested elements, like the xliff:g tag shown, interleaved with plain text.
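To make the mixed content concrete, here is a minimal sketch that walks the children of each <string> element. It uses only the JDK's built-in DOM parser, not xmlutil, so it is not a fix for the problem, only an illustration of what a complete capture would need to contain. The names `sample`, `parseDom` and `describeChildren` are hypothetical helpers introduced for this illustration.

```kotlin
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Document
import org.w3c.dom.Element
import org.w3c.dom.Node

// Same document as the example above (hypothetical name `sample`).
val sample = """
    <?xml version="1.0" encoding="utf-8"?>
    <resources xmlns:xliff="urn:oasis:names:tc:xliff:document:1.2">
        <string name="string_android">Test with argument <xliff:g id="argument">%1s</xliff:g> here</string>
        <string name="test_string_2">test 2</string>
    </resources>
""".trimIndent()

// Parse a string with the JDK DOM parser (plain JAXP, not xmlutil).
fun parseDom(xml: String): Document {
    val factory = DocumentBuilderFactory.newInstance().apply { isNamespaceAware = true }
    return factory.newDocumentBuilder()
        .parse(ByteArrayInputStream(xml.toByteArray(Charsets.UTF_8)))
}

// Describe each direct child of an element: text runs and nested elements.
fun describeChildren(element: Element): List<String> {
    val children = element.childNodes
    return (0 until children.length).map { i ->
        val child = children.item(i)
        when (child.nodeType) {
            Node.TEXT_NODE -> "text(${child.nodeValue})"
            Node.ELEMENT_NODE -> "element(${child.nodeName})"
            else -> "other(${child.nodeType})"
        }
    }
}

fun main() {
    val strings = parseDom(sample).getElementsByTagName("string")
    for (i in 0 until strings.length) {
        val string = strings.item(i) as Element
        println("${string.getAttribute("name")}: ${describeChildren(string)}")
    }
}
```

For the first <string> this should report a text node, the xliff:g element, and another text node, in that order. Those are the three pieces a full capture of the tag's value has to preserve.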

With the example code below, I can't find any form that captures the value within the tags in its entirety.

import kotlinx.serialization.Polymorphic
import kotlinx.serialization.Serializable
import kotlinx.serialization.builtins.serializer
import kotlinx.serialization.modules.SerializersModule
import nl.adaptivity.xmlutil.ExperimentalXmlUtilApi
import nl.adaptivity.xmlutil.serialization.DefaultXmlSerializationPolicy
import nl.adaptivity.xmlutil.serialization.ElementSerializer
import nl.adaptivity.xmlutil.serialization.NodeSerializer
import nl.adaptivity.xmlutil.serialization.XML
import nl.adaptivity.xmlutil.serialization.XmlElement
import nl.adaptivity.xmlutil.serialization.XmlSerialName
import nl.adaptivity.xmlutil.serialization.XmlValue
import org.w3c.dom.Element
import org.w3c.dom.Node

object TestXmlParse {

    val example = """
        <?xml version="1.0" encoding="utf-8"?>
        <resources xmlns:xliff="urn:oasis:names:tc:xliff:document:1.2">
            <string name="string_android">Test with argument <xliff:g id="argument">%1s</xliff:g> here</string>
            <string name="test_string_2">test 2</string>
        </resources>
    """.trimIndent()

    @Serializable
    @XmlSerialName("string")
    data class StringTag(
        @XmlElement(false)
        val name: String,
        @XmlValue
        val node: List<@Polymorphic Any>,
    )

    @Serializable
    @XmlSerialName("resources")
    data class Resources(
        val strings: List<StringTag>,
    )

    @OptIn(ExperimentalXmlUtilApi::class)
    fun parseXml() {
        val xml = XML(
            serializersModule = SerializersModule {
                polymorphic(Any::class, String::class, String.serializer())
                polymorphic(Any::class, Node::class, NodeSerializer)
                polymorphic(Any::class, Element::class, ElementSerializer)
            },
        ) {
            // Empty "unknown child" handler
            policy = DefaultXmlSerializationPolicy(
                pedantic = true,
                unknownChildHandler = { _, _, _, _, _ -> emptyList() },
            )
            recommended {
                autoPolymorphic = false
            }
        }
        val resources = xml.decodeFromString(Resources.serializer(), example)
        println(
            resources
        )
    }
}

Test file to debug

import kotlin.test.Test

class TestXmlParseTest {

    @Test
    fun test() {
        TestXmlParse.parseXml()
    }
}

The line to focus on above is val node: List<@Polymorphic Any> in the StringTag data class.

When it is val node: List<@Polymorphic Any>, I get this captured data model:

[image: screenshot of the captured data model]

The problem is in the first item, whose "node" list holds 2 values. It captures the strings before and after the <xliff:g> tag, but not the tag itself.

When it is any of the following, I get the error below.

kotlinx.serialization.SerializationException: Class 'kotlin.String' is not registered for polymorphic serialization in the scope of 'Node'.
To be registered automatically, class 'kotlin.String' has to be '@Serializable', and the base class 'Node' has to be sealed and '@Serializable'.
Alternatively, register the serializer for 'kotlin.String' explicitly in a corresponding SerializersModule.

Perhaps I'm missing something, or doing something wrong entirely? I just want to parse the value within each <string> tag in an Android strings file, capturing every item it contains.

pdvrieze commented 4 months ago

There are some issues, but it should work the following way: