tefra / xsdata

Naive XML & JSON Bindings for python
https://xsdata.readthedocs.io
MIT License

Convert XML sample into canonical Python representation code #626

Closed vlcinsky closed 2 years ago

vlcinsky commented 2 years ago

Motivation

We maintain a set of XML samples conforming to a specific XML schema.

We are considering generating these samples from Python code, using xsdata and an xsdata-generated package.

It would be helpful to have an option to produce that Python representation by converting existing XML samples into it.

Context

The Python package used to build XML objects depends on the XML schema and on the backing object library:

If we load an XML sample into xsdata based objects, we may:

Proposed approach - pycode serializer

Currently, we use XmlSerializer

from xsdata.formats.dataclass.serializers import XmlSerializer
from xsdata.formats.dataclass.serializers.config import SerializerConfig

If the backing library provided a PycodeSerializer, we could get the Python code generated for us.

Such a serializer would be optional, so new backing libraries would not be obliged to provide one.

To get the Python code, one would follow a documentation example similar to https://xsdata.readthedocs.io/en/latest/json.html#serialize-json-to-string, or we could even add a section on Python code binding that covers more than serialization.

The code would be similar to:

from pathlib import Path

from tests.fixtures.books.books import *
from xsdata.models.datatype import XmlDate

from xsdata.formats.dataclass.context import XmlContext
from xsdata.formats.dataclass.parsers import XmlParser

# PycodeSerializer is the proposed class; it does not exist yet
from xsdata.formats.dataclass.serializers import PycodeSerializer
from xsdata.formats.dataclass.serializers.config import SerializerConfig

# Load the XML sample
xmlpath = Path("books.xml")
xml_str = xmlpath.read_text(encoding="utf-8")

# Parse it into the generated dataclasses (root element is Books)
parser = XmlParser(context=XmlContext())
books = parser.from_string(xml_str, Books)

# Render the parsed object as Python source code
config = SerializerConfig()
serializer = PycodeSerializer(context=XmlContext(), config=config)

pycode_str = serializer.render(books)

print(pycode_str)

and the generated python code would be:

from tests.fixtures.books.books import *
from xsdata.models.datatype import XmlDate

Books(
   book=[
       BookForm(
           id="bk001",
           author="Hightower, Kim",
           title="The First Book",
           genre="Fiction",
           price=44.95,
           review="An amazing story of nothing.",
       ),
       BookForm(
           id="bk002",
           author="Nagata, Suanne",
           title="Becoming Somebody",
           price=33.95,
           pub_date=XmlDate(2001, 1, 10),
           review="A masterpiece of the fine art of gossiping.",
       ),
   ]
)

CLI support would be even better, but it is not really necessary.

tefra commented 2 years ago

Sounds interesting, and I can see myself using this; I definitely needed something like that in the past. But I am not so sure how it fits into the XML & JSON bindings context. It could be a completely unrelated library that prints valid Python code from any Python object or dataclass.
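As a rough illustration of the generic idea, here is a minimal sketch that renders plain dataclass instances as constructor calls using only the standard library. The helper `render_pycode` and the stand-in classes `Books` and `BookForm` are hypothetical and are not part of xsdata; the sketch assumes values are dataclasses, lists, or objects whose `repr` is already valid Python.

```python
from dataclasses import MISSING, dataclass, field, fields, is_dataclass
from typing import List, Optional


def _is_default(f, value):
    """Return True when value equals the declared default of field f."""
    if f.default is not MISSING:
        return value == f.default
    if f.default_factory is not MISSING:
        return value == f.default_factory()
    return False


def render_pycode(obj, indent=0):
    """Render obj as Python source text (hypothetical helper, not xsdata API)."""
    pad = " " * indent
    inner = " " * (indent + 4)
    if is_dataclass(obj) and not isinstance(obj, type):
        lines = []
        for f in fields(obj):
            value = getattr(obj, f.name)
            # Skip non-init fields and values that match the default
            if not f.init or _is_default(f, value):
                continue
            lines.append(f"{inner}{f.name}={render_pycode(value, indent + 4)},")
        name = type(obj).__name__
        if not lines:
            return f"{name}()"
        return "\n".join([f"{name}(", *lines, f"{pad})"])
    if isinstance(obj, list):
        if not obj:
            return "[]"
        items = [f"{inner}{render_pycode(v, indent + 4)}," for v in obj]
        return "\n".join(["[", *items, f"{pad}]"])
    # Assumption: repr() of every remaining value is valid Python
    return repr(obj)


# Stand-in classes that only mimic the shape of the books fixture
@dataclass
class BookForm:
    id: str
    author: Optional[str] = None
    price: Optional[float] = None


@dataclass
class Books:
    book: List[BookForm] = field(default_factory=list)


books = Books(book=[BookForm(id="bk001", author="Hightower, Kim", price=44.95)])
code = render_pycode(books)
print(code)

# The rendered text can be evaluated to rebuild an equal object
rebuilt = eval(code, {"Books": Books, "BookForm": BookForm})
```

The sketch does not collect the imports the rendered code needs, and it does not handle cycles or shared references.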

vlcinsky commented 2 years ago

Code that turns any Python object or dataclass into valid Python code would be great, and it sounds straightforward. But I am not sure it is as simple as it sounds. I would expect it to be as complex as pickling or the deepcopy code. I will be happy if anyone proves me wrong.

With PycodeSerializer, we would give all the control to the author of the library that creates the classes used to represent the input (XML or JSON) data.
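One way to see why a generic converter is harder than it sounds: relying on `repr` works for some values but not for others. The sketch below uses only the standard library; `Genre` and `round_trips` are hypothetical names made up for this illustration.

```python
import datetime
from enum import Enum


class Genre(Enum):
    FICTION = "Fiction"


def round_trips(value, namespace):
    """Return True if evaluating repr(value) rebuilds an equal value."""
    try:
        return eval(repr(value), dict(namespace)) == value
    except SyntaxError:
        # repr() produced text that is not valid Python source
        return False


names = {"datetime": datetime, "Genre": Genre}
results = {
    "str": round_trips("The First Book", names),
    "date": round_trips(datetime.date(2001, 1, 10), names),
    "enum": round_trips(Genre.FICTION, names),
}
print(results)
```

The string and the date survive the round trip, while the enum member does not, because its `repr` has the form `<Genre.FICTION: 'Fiction'>`. A library-specific serializer can special-case such types, which is the kind of control the proposal hands to the library author.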

tefra commented 2 years ago

I would gladly accept any contributions on the subject, but it's really not a top priority for a binding library...

tefra commented 2 years ago

Re-opening this one

tefra commented 2 years ago

Hi @vlcinsky, I know it's been a while, but the first version is now on master. Give it a try.

vlcinsky commented 2 years ago

@tefra excellent.

My first tests show that the code looks good; the only issue is with the lang parameter.

I assume the reason could be that lang is defined at the XML level itself and not as part of the XSD. If I comment the lang parameter out, things work well.

I will create a test case for it.
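For background on the XML-level definition mentioned above, assuming the lang parameter corresponds to an `xml:lang` attribute (the thread does not state this explicitly): that attribute lives in the reserved XML namespace and needs no declaration in the document. The standard-library sketch below shows how a parser reports it; the sample element is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The reserved namespace bound to the "xml" prefix
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical sample; the xml prefix needs no xmlns declaration
sample = '<title xml:lang="en">The First Book</title>'
element = ET.fromstring(sample)

# The attribute is stored under its namespace-qualified name
qualified = element.get(f"{{{XML_NS}}}lang")
unqualified = element.get("lang")
print(qualified, unqualified)
```

A generated binding therefore has to map the namespace-qualified attribute to a Python field name such as lang, which is one place where generated code could diverge from the class definition.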

tefra commented 2 years ago

Can you open a new issue with an example? I can take a look

vlcinsky commented 2 years ago

Detailed instructions to reproduce the error are in #697