dapper91 / pydantic-xml

python xml for humans
https://pydantic-xml.readthedocs.io
The Unlicense

Question: Are dynamic tags possible? #92

Closed ja-albert closed 1 year ago

ja-albert commented 1 year ago

Hello again :) This time with a question rather than an enhancement suggestion:

I'm currently confronted with the problem of having to create XML elements with different tags depending on what content they have (please don't ask me why - I don't know either). For example:

<A>
    <B>...</B>
    <B10>...</B10>
    <B>...</B>
    <B42>...</B42>
</A>

In this case, there is a B element which may simply have B as its tag, but could also have a number within the tag. In this example, the tag matches the regex B\d*, but in other cases arbitrary characters and digits may be possible. In Python, it will probably look something like:

from pydantic_xml import BaseXmlModel

# B is defined before A so that the annotation list[B] can be resolved
class B(BaseXmlModel, tag="B"):
    size: int | None = None

    # size == None -> tag == "B"
    # isinstance(size, int) -> tag == f"B{size}"

    # ...?

class A(BaseXmlModel):
    content: list[B]

However, I have no idea how to dynamically alter a tag for a specific instance of a BaseXmlModel class - is it even possible? Maybe using wrapped? I already tried it a bit, but could not find the correct attributes to alter an instance.

I also did not find any information on this in the documentation, other than that dynamic model creation is not supported (which might have been a workaround).
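The tag rule described above (a bare B, or B followed by digits) can be written down independently of pydantic-xml. The following is only a sketch of that rule using the standard library; `split_tag` and `build_tag` are hypothetical helper names, not part of any library.

```python
import re
from typing import Optional, Tuple

# Matches "B" optionally followed by digits, i.e. the regex B\d* from the question.
TAG_PATTERN = re.compile(r"(B)(\d*)")


def split_tag(tag: str) -> Tuple[str, Optional[int]]:
    """Split a tag such as "B42" into its base name and optional size."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"unexpected tag: {tag!r}")
    base, digits = match.groups()
    return base, int(digits) if digits else None


def build_tag(base: str, size: Optional[int]) -> str:
    """Inverse of split_tag: size None gives "B", size 10 gives "B10"."""
    return base if size is None else f"{base}{size}"
```

For example, `split_tag("B42")` returns `("B", 42)`, and `build_tag("B", None)` returns `"B"`.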

ja-albert commented 1 year ago

It may be that my MWE is too minimal. In fact, the class hierarchy is more like this:

from pydantic_xml import BaseXmlModel

# B and C are defined before A so that the annotation list[B | C] can be resolved
class B(BaseXmlModel, tag="B"):
    size: int | None = None

    # size == None -> tag == "B"
    # isinstance(size, int) -> tag == f"B{size}"

    # ...?

class C(BaseXmlModel, tag="C"):
    pass

class A(BaseXmlModel):
    content: list[B | C]

And the resulting XML is more like this:


<A>
    <B>...</B>
    <C/>
    <B10>...</B10>
    <C>...</C>
    <B>...</B>
    <B42>...</B42>
    <C/>
</A>

dapper91 commented 1 year ago

@ja-albert Hi

Unfortunately, there is no way to specify an element tag dynamically based on its content, because the model serializer is built when the model is defined and can't be altered afterwards.

The simplest way I came up with is to modify the xml tree before serialization and deserialization:

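import re
from typing import Any, Dict, Optional
# Assumption: stdlib backend; if lxml is installed, use `from lxml import etree` instead
import xml.etree.ElementTree as etree

from pydantic_xml import BaseXmlModel

class A(BaseXmlModel):
    content: list[B | C]

    @classmethod
    def from_xml_tree(cls, root: etree.Element, context: Optional[Dict[str, Any]] = None) -> 'A':
        # Map numbered tags (B10, C3, ...) back to the static tags the models expect
        for element in root:
            if re.fullmatch(r'B\d+', element.tag):
                element.tag = 'B'
            elif re.fullmatch(r'C\d+', element.tag):
                element.tag = 'C'

        return super().from_xml_tree(root, context)

    def to_xml_tree(self, *, skip_empty: bool = False) -> etree.Element:
        root = super().to_xml_tree(skip_empty=skip_empty)
        # Append each child's text to its tag after the static tree has been built
        for sub_element in root:
            if sub_element.tag in ('B', 'C') and sub_element.text:
                sub_element.tag = f'{sub_element.tag}{sub_element.text}'

        return root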

but it will only work if A is a root element.
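The tag rewriting itself can be tried without pydantic-xml. The sketch below uses only the standard library and differs from the code above in one respect: instead of discarding the digits when renaming B10 to B, it keeps them in an attribute. The attribute name `size` and the helper name `normalize_tags` are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET


def normalize_tags(root: ET.Element) -> ET.Element:
    """Rename direct children tagged B<number> to B, keeping the number as a 'size' attribute."""
    for element in root:
        match = re.fullmatch(r"(B)(\d+)", element.tag)
        if match:
            element.tag = match.group(1)
            element.set("size", match.group(2))
    return root


root = normalize_tags(ET.fromstring("<A><B>x</B><B10>y</B10><C/></A>"))
tags = [child.tag for child in root]
sizes = [child.get("size") for child in root]
```

After normalization every B-like child carries the static tag B, so a model with a fixed tag could parse it; whether the number is then read through an attribute field on the model is a design choice left open here.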

ja-albert commented 1 year ago

Thank you for your answer!

I could get it to work using an approach similar to yours: in the to_xml_tree method, I generate the root element and then use an XPath expression to find all the elements that have to be post-processed. Similar to your code, I manipulate the elements so that they have the correct tags and attributes afterward. It ain't pretty, but it works :man_shrugging:
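The actual code was not posted, so the following is only a guess at what such XPath-based post-processing might look like, written against the standard library. It assumes the serialized B elements carry a `size` attribute (a hypothetical convention) and renames them at any nesting depth, which sidesteps the root-element restriction; `apply_dynamic_tags` is a made-up name.

```python
import xml.etree.ElementTree as ET


def apply_dynamic_tags(root: ET.Element) -> ET.Element:
    """Turn every descendant <B size="N"> into <BN> and drop the size attribute."""
    # ".//B[@size]" selects all descendant B elements that have a size attribute.
    for element in root.findall(".//B[@size]"):
        size = element.attrib.pop("size")
        element.tag = f"B{size}"
    return root


source = '<Doc><A><B>x</B><B size="10">y</B><C/><B size="42">z</B></A></Doc>'
root = apply_dynamic_tags(ET.fromstring(source))
result_tags = [element.tag for element in root.iter()]
result_xml = ET.tostring(root, encoding="unicode")
```

In a model, a function like this would be called on the tree returned by `super().to_xml_tree(...)` before it is returned to the caller.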