tefra / xsdata

Naive XML & JSON Bindings for python
https://xsdata.readthedocs.io
MIT License

Allow a generator to be provided instead of a List #1030

Open skinkie opened 5 months ago

skinkie commented 5 months ago

📒 Description

When writing a very large tree, you don't want to materialise the whole tree in a list before it is written to a file. Ideally, each subtree should only be rendered just in time.

🔗 What I've Done

I have allowed a Generator to be handled as a List.

💬 Comments

There might be more places where this needs to be changed. Since List[] is used everywhere, type hinting fails.
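
To illustrate the idea, here is a minimal sketch (not xsdata's actual implementation): a writer that walks its children one at a time never needs the whole collection in memory, so a generator can be handled like a list. The names `is_collection`, `write_items` and `numbers` are hypothetical and exist only for this example.

```python
import io
from collections.abc import Iterator
from typing import Any, Iterable, TextIO


def is_collection(value: Any) -> bool:
    """Hypothetical check: treat lists, tuples and lazy iterators alike."""
    return isinstance(value, (list, tuple, Iterator))


def write_items(out: TextIO, tag: str, items: Iterable[Any]) -> int:
    """Write one element per item as it is produced; return the count."""
    count = 0
    for item in items:  # a generator is advanced one item at a time here
        out.write(f"<{tag}>{item}</{tag}>\n")
        count += 1
    return count


def numbers(limit: int) -> Iterator[int]:
    """Stand-in for an expensive, lazily loaded subtree."""
    for n in range(limit):
        yield n


buffer = io.StringIO()
lazy = numbers(3)
accepted = is_collection(lazy) and is_collection([1, 2])
written = write_items(buffer, "n", lazy)
```

Strings are deliberately not matched by `is_collection`, since they are iterable but should be written as a single value.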

🛫 Checklist

import sqlite3

from xsdata.formats.dataclass.context import XmlContext
from xsdata.formats.dataclass.parsers import XmlParser
from xsdata.formats.dataclass.parsers.config import ParserConfig
from xsdata.formats.dataclass.parsers.handlers import LxmlEventHandler
from xsdata.formats.dataclass.serializers import XmlSerializer
from xsdata.formats.dataclass.serializers.config import SerializerConfig
from xsdata.models.datatype import XmlDateTime

from netex import (
    DataObjectsRelStructure,
    GeneralFrame,
    GeneralFrameMembersRelStructure,
    MultilingualString,
    ParticipantRef,
    PublicationDelivery,
    ServiceJourney,
)

serializer_config = SerializerConfig(ignore_default_attributes=True, xml_declaration=True)
serializer_config.pretty_print = True
serializer = XmlSerializer(config=serializer_config)

context = XmlContext()
config = ParserConfig(fail_on_unknown_properties=False)
parser = XmlParser(context=context, config=config, handler=LxmlEventHandler)


def load_generator(con, clazz, limit=None):
    # The table is named after the element (Meta.name), falling back to the class name.
    table = getattr(clazz.Meta, 'name', clazz.__name__)

    cur = con.cursor()
    if limit is None:
        cur.execute(f"SELECT object FROM {table};")
    else:
        # Bind the limit as a parameter instead of interpolating it into the SQL.
        cur.execute(f"SELECT object FROM {table} LIMIT ?;", (limit,))

    # Iterate the cursor lazily so that only one row is parsed at a time.
    for (xml,) in cur:
        yield parser.from_bytes(xml, clazz)


# Note: sqlite3's context manager only commits or rolls back, it does not close
# the connection, so the generator can still read rows when it is consumed by
# serializer.write() below.
with sqlite3.connect("/tmp/netex.sqlite") as con:
    publication_delivery = PublicationDelivery(
        publication_timestamp=XmlDateTime.now(),
        participant_ref=ParticipantRef(value="NDOV"),
        description=MultilingualString(value="Huge XML Serializer test"),
        data_objects=DataObjectsRelStructure(
            choice=[
                GeneralFrame(
                    members=GeneralFrameMembersRelStructure(
                        # A generator instead of a list: the point of this PR.
                        choice=load_generator(con, ServiceJourney, 10)
                    )
                )
            ]
        ),
        version="ntx:1.1",
    )

ns_map = {'': 'http://www.netex.org.uk/netex', 'gml': 'http://www.opengis.net/gml/3.2'}
with open('netex-output/huge.xml', 'w', encoding='utf-8') as out:
    serializer.write(out, publication_delivery, ns_map)
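
The row-streaming part of the script can be tried without the NeTEx models. The following is a self-contained sketch using an in-memory database: the table `ServiceJourney` with an `object` column mirrors the script above, while `iter_objects` and `IDENTIFIER` are hypothetical names introduced here. Table names cannot be bound as SQL parameters, so the sketch validates them, and binds only the limit.

```python
import re
import sqlite3
from typing import Iterator, Optional

# Hypothetical allow-pattern for table names: plain identifiers only.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def iter_objects(
    con: sqlite3.Connection, table: str, limit: Optional[int] = None
) -> Iterator[bytes]:
    """Yield the `object` column of `table` one row at a time."""
    # Identifiers cannot be passed as `?` parameters, so validate them instead.
    if not IDENTIFIER.fullmatch(table):
        raise ValueError(f"unsafe table name: {table!r}")
    sql = f'SELECT object FROM "{table}"'
    params: tuple = ()
    if limit is not None:
        sql += " LIMIT ?"  # values, unlike identifiers, can be bound
        params = (limit,)
    for (payload,) in con.execute(sql, params):  # the cursor itself is lazy
        yield payload


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE ServiceJourney (object BLOB)")
con.executemany(
    "INSERT INTO ServiceJourney VALUES (?)",
    [(f"<ServiceJourney id='{i}'/>".encode(),) for i in range(5)],
)
rows = list(iter_objects(con, "ServiceJourney", limit=2))

# Because iter_objects is a generator, validation runs on first iteration.
try:
    list(iter_objects(con, "ServiceJourney; DROP TABLE x"))
    rejected = False
except ValueError:
    rejected = True
```

In the real script each payload would be passed to `parser.from_bytes` before being yielded.
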

codecov[bot] commented 5 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 100.00%. Comparing base (cce0a16) to head (d4b3e5b).

:exclamation: Current head d4b3e5b differs from pull request most recent head 67d847f

Please upload reports for the commit 67d847f to get more accurate results.

Additional details and impacted files

```diff
@@            Coverage Diff            @@
##              main     #1030   +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files          115       115
  Lines         9238      9265   +27
  Branches      2179      2190   +11
=========================================
+ Hits          9238      9265   +27
```


tefra commented 5 months ago

We need to properly support the Iterable type annotation in model fields, and of course support this for both xml/json serialization @skinkie.
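
As a rough illustration of what recognising an `Iterable` annotation on a model field could involve, the sketch below inspects dataclass fields with the standard `typing` helpers. It is not how xsdata builds its metadata; `Frame`, `Members` and `collection_fields` are hypothetical names for this example.

```python
import collections.abc
from dataclasses import dataclass, field, fields
from typing import Iterable, List, Union, get_args, get_origin, get_type_hints


@dataclass
class Frame:
    name: str = ""


@dataclass
class Members:
    # Hypothetical model: `choice` may hold a list or a lazy generator.
    choice: Iterable[Union[Frame, str]] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    label: str = ""


def collection_fields(clazz: type) -> dict:
    """Map each collection-typed field to (origin, item types)."""
    hints = get_type_hints(clazz)
    result = {}
    for f in fields(clazz):
        origin = get_origin(hints[f.name])
        # Iterable[...] resolves to collections.abc.Iterable, List[...] to list.
        if origin in (list, tuple, collections.abc.Iterable):
            result[f.name] = (origin, get_args(hints[f.name]))
    return result


info = collection_fields(Members)
choice_origin, choice_args = info["choice"]
```

Here `choice_origin` is the abstract `collections.abc.Iterable`, which a binding layer would have to handle alongside `list` and `tuple` for both XML and JSON.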

skinkie commented 5 months ago

> We need to properly support the Iterable type annotation in model fields, and of course support this for both xml/json serialization @skinkie.

But would this be something you would support from an architecture point of view?

tefra commented 5 months ago

> > We need to properly support the Iterable type annotation in model fields, and of course support this for both xml/json serialization @skinkie.
>
> But would this be something you would support from an architecture point of view?

Yes

skinkie commented 4 months ago

@tefra How would you like to proceed? Materialise List or Tuple in the tests?
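
For context on why the tests would need to materialise: two generator objects never compare equal, and a generator can only be consumed once. A small sketch with a hypothetical `Container` model (not taken from the xsdata test suite):

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator


@dataclass
class Container:
    # Hypothetical model with a lazily populated field.
    items: Iterable[int] = field(default_factory=list)


def lazy_items() -> Iterator[int]:
    yield from (1, 2, 3)


first = Container(items=lazy_items())
second = Container(items=lazy_items())

# Dataclass equality compares the field values, and two distinct generator
# objects are never equal, even if they would yield the same items.
generators_equal = first == second

# Materialising gives a stable value to assert on, but it also exhausts
# the generator: a second pass yields nothing.
as_list = list(first.items)
after_exhaustion = list(first.items)
```

So an assertion would have to compare `list(...)` or `tuple(...)` of the field rather than the model instances themselves.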

skinkie commented 4 months ago

Testing it with my own code results in this error. So it is clearly not done yet.

xsdata.exceptions.XmlContextError: Error on DataObjectsRelStructure::choice: Xml Elements does not support typing `typing.Iterable[typing.Union[netex.general_version_frame_structure.CompositeFrame, netex.mobility_journey_frame.MobilityJourneyFrame, netex.mobility_service_frame.MobilityServiceFrame, netex.sales_transaction_frame.SalesTransactionFrame, netex.fare_frame.FareFrame, netex.driver_schedule_frame.DriverScheduleFrame, netex.vehicle_schedule_frame.VehicleScheduleFrame, netex.service_frame.ServiceFrame, netex.timetable_frame.TimetableFrame, netex.site_frame.SiteFrame, netex.infrastructure_frame.InfrastructureFrame, netex.general_version_frame_structure.GeneralFrame, netex.resource_frame.ResourceFrame, netex.service_calendar_frame.ServiceCalendarFrame]]`

sonarcloud[bot] commented 4 months ago

Quality Gate failed

Failed conditions
1 Security Hotspot

See analysis details on SonarCloud

skinkie commented 4 months ago

Parsing breaks with Iterable.

        if tokens_factory:
            value = value if collections.is_array(value) else value.split()
            return tokens_factory(
                converter.deserialize(val, types, ns_map=ns_map, format=format)
                for val in value
            )
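
One possible cause, offered as an assumption since the thread does not show a traceback: if the factory is derived from the field annotation, then for `Iterable` it would be an abstract type, which cannot be called the way `list` or `tuple` can. The sketch below demonstrates that behaviour with the standard library only; `resolve_factory` and `parse_tokens` are hypothetical helpers, not xsdata functions.

```python
import collections.abc
from typing import Any, Callable, Iterable


def resolve_factory(origin: Any) -> Callable[[Iterable[Any]], Any]:
    """Hypothetical helper: pick a concrete container for an annotation origin."""
    if origin in (list, tuple, set, frozenset):
        return origin
    # Abstract types such as collections.abc.Iterable cannot be instantiated,
    # so fall back to a concrete list.
    return list


def parse_tokens(raw: str, origin: Any) -> Any:
    """Split a whitespace-separated value and convert each token to int."""
    factory = resolve_factory(origin)
    return factory(int(token) for token in raw.split())


# Calling the abstract type directly raises TypeError.
try:
    collections.abc.Iterable(int(t) for t in "1 2 3".split())
    abstract_call_failed = False
except TypeError:
    abstract_call_failed = True

as_tuple = parse_tokens("1 2 3", tuple)
as_fallback = parse_tokens("1 2 3", collections.abc.Iterable)
```

Falling back to `list` is only one option. Returning the generator expression unmaterialised would keep parsing lazy too, at the cost of single-use field values.
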

skinkie commented 3 months ago

Parsing solved.

skinkie commented 3 months ago

Although it looked like it was working again, I noticed that Iterable breaks the parsing once more.

sonarcloud[bot] commented 2 months ago

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud