gatkin / declxml

Declarative XML processing for Python
https://declxml.readthedocs.io/en/latest/
MIT License

non-required named_tuple #24

Closed mhemeryck closed 5 years ago

mhemeryck commented 5 years ago

Hi, first of all, thanks for the nice library. It looks well designed and fits the needs of a project I am currently working on.

I have a heavily nested document structure and use a large number of named tuples to represent it.

When testing on a real-world example, the named tuple processors work when the corresponding XML element is present, but fail when it is absent, even though I explicitly marked the processor as non-required.

Snippets (slightly altered):

Processor:

import declxml as xml

processor = xml.named_tuple(
    "Invoice",
    Invoice,
    [
        ...
        xml.named_tuple(
            "Reference",
            Reference,
            [xml.string("ID")],
            required=False,
        ),
    ]
)

Named tuple definition:

from collections import namedtuple
Reference = namedtuple("Reference", ["ID"])

The error I am getting is:

dict_value = {}

    def _from_dict(dict_value):
>       return tuple_type(**dict_value)
E       TypeError: __new__() takes exactly 2 arguments (1 given)
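The traceback shows an empty `dict_value` being expanded into keyword arguments for the named tuple constructor. The sketch below reproduces that failure with the standard library alone. The `build` function is a hypothetical stand-in for the `_from_dict` call in the traceback, not the library's code.

```python
from collections import namedtuple

Reference = namedtuple("Reference", ["ID"])


def build(tuple_type, dict_value):
    # Hypothetical stand-in for the failing call in the traceback:
    # expand the parsed dictionary into keyword arguments.
    return tuple_type(**dict_value)


# With the field present, construction succeeds.
ok = build(Reference, {"ID": "abc"})

# With an empty dictionary, the required "ID" field is missing,
# so the constructor raises TypeError.
try:
    build(Reference, {})
    failed = False
except TypeError:
    failed = True
```

The exact wording of the `TypeError` message differs between Python 2.7 and Python 3, but both reject the call because `ID` has no default value.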

My best guess is that it has something to do with the named_tuple parsing, where it doesn't really check for any non-required values: https://github.com/gatkin/declxml/blob/afefe8afcb69988771ecd2c5fddbcafa545df888/declxml.py#L1515

Can you confirm, or am I missing something?
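To illustrate the kind of check I mean, here is a minimal sketch. It assumes that a missing optional element reaches the conversion step as an empty dictionary, as the traceback suggests. The helper name `from_dict_or_none` is hypothetical and this is not necessarily how the library does or should handle it.

```python
from collections import namedtuple

Genre = namedtuple("Genre", ["name"])


def from_dict_or_none(tuple_type, dict_value):
    """Hypothetical helper: build the tuple, or return None when nothing was parsed."""
    if not dict_value:
        # Assumption: an empty dictionary means the optional element was absent.
        return None
    return tuple_type(**dict_value)


present = from_dict_or_none(Genre, {"name": "horror"})
absent = from_dict_or_none(Genre, {})
```

Returning `None` for the absent case is only one possible convention. A default-constructed tuple would be another.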

I did check that it works when I specify it as an array (much like the examples given in the docs), but I actually want a 0-or-1 relationship rather than a 0-or-n relationship.
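As a stopgap, the array form could be combined with a small post-processing step that enforces the 0-or-1 constraint after parsing. The sketch below assumes the parsed array arrives as a list (possibly empty). The helper `zero_or_one` is hypothetical and not part of declxml.

```python
def zero_or_one(items):
    """Hypothetical helper: collapse a parsed list into a single item or None."""
    if not items:
        # No element was present in the XML.
        return None
    if len(items) > 1:
        # More than one element violates the 0-or-1 relationship.
        raise ValueError("expected at most one item, got %d" % len(items))
    return items[0]


none_case = zero_or_one([])
single_case = zero_or_one(["horror"])
```

This keeps the cardinality check explicit, at the cost of an extra step after each parse.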

mhemeryck commented 5 years ago

Confirmed the issue for me with a more self-contained example:

from collections import namedtuple

import declxml as xml

Author = namedtuple("Author", ["name", "birth_year", "genre"])
Genre = namedtuple("Genre", ["name"])

processor = xml.named_tuple(
    "author",
    Author,
    [
        xml.string("name"),
        xml.integer("birth-year", alias="birth_year"),
        xml.named_tuple("genre", Genre, [xml.string("name")], required=False),
    ],
)

author_xml = """
<author>
    <name>Robert A. Heinlein</name>
    <birth-year>1907</birth-year>
</author>
"""
# author_xml = """
# <author>
#     <name>Robert A. Heinlein</name>
#     <birth-year>1907</birth-year>
#     <genre>
#         <name>horror</name>
#     </genre>
# </author>
# """

author = xml.parse_from_string(processor, author_xml)

Parsing works when the XML contains the "genre" element, but fails when I leave it out.

Tested on both Python 2.7 and 3.7.

gatkin commented 5 years ago

Thank you so much for letting me know about this issue! I'm very glad to hear that this library has been useful for you.

This issue has been fixed in https://github.com/gatkin/declxml/commit/8746a32615387020c64827c4d33e3ee9ff164bea and is available on PyPI under version 1.1.3 of the declxml package.

Please let me know if you have any issues with the new version, and happy coding!