dapper91 / pydantic-xml

python xml for humans
https://pydantic-xml.readthedocs.io
The Unlicense

Default namespace not respected in child element #137

Closed: psongers closed this issue 8 months ago

psongers commented 8 months ago

Hi,

I'm trying to parse an XML file from a third party with a structure like:

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema"
               xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <X xmlns="http://www.example.com/">
      <Y>abc</Y>
    </X>
  </soap:Body>
</soap:Envelope>

I have defined these models:

from pydantic_xml import BaseXmlModel, element

class X(BaseXmlModel, tag="X", nsmap={"": "http://www.example.com/"}):
    result: str = element(tag="Y")

class B(BaseXmlModel, ns="soap", tag="Body"):
    x: X

class E(
    BaseXmlModel,
    tag="Envelope",
    ns="soap",
    nsmap={
        "soap": "http://schemas.xmlsoap.org/soap/envelope/",
        "xsd": "http://www.w3.org/2001/XMLSchema",
        "xsi": "http://www.w3.org/2001/XMLSchema-instance",
    },
):
    b: B

Parsing the XML with E fails with this stack trace:

Traceback (most recent call last):
  File "...", line 38, in <module>
    pp = E.from_xml(x.encode())
         ^^^^^^^^^^^^^^^^^^^^^^
  File "...", line 346, in from_xml
    return cls.from_xml_tree(etree.fromstring(source), context=context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...y", line 329, in from_xml_tree
    obj = typing.cast(ModelT, cls.__xml_serializer__.deserialize(XmlElement.from_native(root), context=context))
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...", line 178, in deserialize
    result = {
             ^
  File "...", line 181, in <dictcomp>
    if (field_value := field_serializer.deserialize(element, context=context)) is not None
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...", line 358, in deserialize
    return self._model.__xml_serializer__.deserialize(sub_element, context=context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...", line 187, in deserialize
    return self._model.model_validate(result, strict=False, context=context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "...", line 503, in model_validate
    return cls.__pydantic_validator__.validate_python(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for B
x
  Field required [type=missing, input_value={}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.4/v/missing

After some digging, I think the problematic line is here: https://github.com/dapper91/pydantic-xml/blob/master/pydantic_xml/serializers/factories/model.py#L286

It's passing the parent's ns "soap", which does not exist in the nsmap, so the qualified name becomes X instead of {http://www.example.com/}X.
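
For reference, the qualified names the parser actually produces for this document can be checked with the standard library alone. This is a small sketch using xml.etree.ElementTree rather than pydantic-xml, with the unused xsi/xsd declarations dropped for brevity; it only shows which names a model would have to match.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the document from the report (xsi/xsd declarations omitted).
SOAP_XML = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <X xmlns="http://www.example.com/">
      <Y>abc</Y>
    </X>
  </soap:Body>
</soap:Envelope>"""

root = ET.fromstring(SOAP_XML.encode())

# Every element's tag in document order, in {uri}local (Clark) notation.
tags = [el.tag for el in root.iter()]
for tag in tags:
    print(tag)

# Looking up an unqualified "X" under Body finds nothing, because the
# element's real name carries the default namespace.
SOAP_NS = "{http://schemas.xmlsoap.org/soap/envelope/}"
unqualified_hit = root.find(f"{SOAP_NS}Body/X")
qualified_hit = root.find(f"{SOAP_NS}Body/{{http://www.example.com/}}X")
```

Both X and Y live in http://www.example.com/ because the default namespace declared on X also applies to its children, so a lookup for a plain X cannot match.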

dapper91 commented 8 months ago

It is the intended behavior that a sub-model implicitly inherits its parent's namespace if its own is not explicitly defined. So in your case you should define X's namespace like this:

class X(BaseXmlModel, tag="X", ns='', nsmap={"": "http://www.example.com/"}):
    result: str = element(tag="Y")

or this:

class B(BaseXmlModel, ns="soap", tag="Body"):
    x: X = element(ns='')
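
To illustrate why declaring ns='' makes a difference, here is a simplified model of the name resolution described in this thread. The qualified_name helper is hypothetical and is not pydantic-xml's actual code; it assumes a prefix that is missing from the nsmap silently resolves to no namespace, which matches the behavior reported above.

```python
from typing import Dict, Optional


def qualified_name(tag: str, ns: Optional[str], nsmap: Dict[str, str]) -> str:
    """Hypothetical helper: turn a prefix plus nsmap into a {uri}tag name."""
    uri = nsmap.get(ns) if ns is not None else None
    # An unknown prefix (or no prefix at all) leaves the tag unqualified.
    return f"{{{uri}}}{tag}" if uri else tag


x_nsmap = {"": "http://www.example.com/"}

# X inherits ns="soap" from Body, but X's own nsmap has no "soap" entry.
inherited = qualified_name("X", "soap", x_nsmap)

# With ns='' set explicitly, the default-namespace entry is used.
explicit = qualified_name("X", "", x_nsmap)

print(inherited)
print(explicit)
```

Under these assumptions the inherited prefix yields a bare X, while the explicit empty prefix yields the namespaced name that the document actually contains.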

dapper91 commented 8 months ago

Bug fixed in version 2.4.0.