tefra / xsdata

Naive XML & JSON Bindings for python
https://xsdata.readthedocs.io
MIT License
324 stars 59 forks source link

For unions the types are more elegant #992

Closed skinkie closed 6 months ago

skinkie commented 6 months ago

Consider the following generated code: https://github.com/tefra/xsdata-samples/blob/main/netex/models/stop_point_in_journey_pattern_versioned_child_structure.py#L39

The scheduled_stop_point_ref gets a very pretty type FareScheduledStopPointRef | ScheduledStopPointRef. What is the reason onward_timing_link_ref gets the 'RefStructure' (type) variant? Is there a possibility to get the pretty variant for the ordinary elements as well?

        <xsd:group name="StopPointInJourneyPatternTimingGroup">
                <xsd:annotation> 
                        <xsd:documentation>TiingPoint Elements for a STOP POINT IN JOURNEY PATTERN.</xsd:documentation>
                </xsd:annotation>
                <xsd:sequence>  
                        <xsd:element ref="ScheduledStopPointRef"/>
                        <xsd:element name="OnwardTimingLinkRef" type="TimingLinkRefStructure" minOccurs="0">
                                <xsd:annotation>
                                        <xsd:documentation>Onward link - used to disambiguate if there are multiple links from the same stop, e.g. as for cloverleaf route topology. If not given explicitly assume there is only one link that connects the two.</xsd:documentation>
                                </xsd:annotation>
                        </xsd:element>  
                        <xsd:group ref="TimingPointWaitGroup"/>
                </xsd:sequence> 
        </xsd:group>

        <xsd:element name="ScheduledStopPointRef" type="ScheduledStopPointRefStructure" substitutionGroup="TimingPointRef">
                <xsd:annotation>
                        <xsd:documentation>Reference to a SCHEDULED STOP POINT.</xsd:documentation>
                </xsd:annotation>
        </xsd:element>
        <xsd:complexType name="ScheduledStopPointRefStructure">
                <xsd:annotation>
                        <xsd:documentation>Type for a reference to a SCHEDULED STOP POINT.</xsd:documentation>
                </xsd:annotation>
                <xsd:simpleContent>
                        <xsd:restriction base="TimingPointRefStructure">
                                <xsd:attribute name="ref" type="ScheduledStopPointIdType" use="required">
                                        <xsd:annotation>
                                                <xsd:documentation>Identifier of SCHEDULED STOP POINT.</xsd:documentation>
                                        </xsd:annotation>
                                </xsd:attribute>
                        </xsd:restriction>
                </xsd:simpleContent>
        </xsd:complexType>
    scheduled_stop_point_ref: Optional[
        Union[FareScheduledStopPointRef, ScheduledStopPointRef]
    ] = field(
        default=None,
        metadata={
            "type": "Elements",
            "choices": (
                {
                    "name": "FareScheduledStopPointRef",
                    "type": FareScheduledStopPointRef,
                    "namespace": "http://www.netex.org.uk/netex",
                },
                {
                    "name": "ScheduledStopPointRef",
                    "type": ScheduledStopPointRef,
                    "namespace": "http://www.netex.org.uk/netex",
                },
            ),
        },
    )
    onward_timing_link_ref: Optional[TimingLinkRefStructure] = field(
        default=None,
        metadata={
            "name": "OnwardTimingLinkRef",
            "type": "Element",
            "namespace": "http://www.netex.org.uk/netex",
        },
    )
tefra commented 6 months ago

I don't follow @skinkie 🥲

skinkie commented 6 months ago

scheduled_stop_point_ref becomes FareScheduledStopPointRef | ScheduledStopPointRef onward_timing_link_ref becomes TimingLinkRefStructure

I would like to see onward_timing_link_ref as TimingLinkRef.

I don't see what in the XML Schema causes this distinction. Why wouldn't scheduled_stop_point_ref become FareScheduledStopPointRefStructure | ScheduledStopPointRefStructure?

tefra commented 6 months ago

I don't understand the comparison, in both cases the generator behaves exactly the same, which is avoid flattening global types (root level elements and complex types.)

scheduled_stop_point_ref gets ScheduledStopPointRef from <xsd:element ref="ScheduledStopPointRef"/>

The scheduled_stop_point_ref also gets FareScheduledStopPointRef from

<xsd:element name="FareScheduledStopPointRef" type="FareScheduledStopPointRefStructure" substitutionGroup="ScheduledStopPointRef">

onward_timing_link_ref gets OnwardTimingLinkRef from <xsd:element name="OnwardTimingLinkRef" type="TimingLinkRefStructure" minOccurs="0">

skinkie commented 6 months ago

I don't understand the comparison, in both cases the generator behaves exactly the same, which is avoid flattening global types (root level elements and complex types.)

scheduled_stop_point_ref gets ScheduledStopPointRef from <xsd:element ref="ScheduledStopPointRef"/>

So you say that if

                        <xsd:element name="OnwardTimingLinkRef" type="TimingLinkRefStructure" minOccurs="0">
                                <xsd:annotation>
                                        <xsd:documentation>Onward link - used to disambiguate if there are multiple links from the same stop, e.g. as for cloverleaf route topology. If not given explicitly assume there is only one link that connects the two.</xsd:documentation>
                                </xsd:annotation>
                        </xsd:element>  

would be replaced by:

                        <xsd:element ref="OnwardTimingLinkRef" />  

        <xsd:element name="OnwardTimingLinkRef" type="ScheduledStopPointRefStructure" substitutionGroup="TimingPointRef">
                <xsd:annotation>
                        <xsd:documentation>Onward link - used to disambiguate if there are multiple links from the same stop, e.g. as for cloverleaf route topology. If not given explicitly assume there is only one link that connects the two.</xsd:documentation>
                </xsd:annotation>
        </xsd:element>

It would make TimingLinkRef (as type) out of it?

My ideal would be that the TimingLinkRef is used, I noticed that also xmlspy and likes take the RefStructure, so it is likely something in the schema. The user must use TimingLinkRef (and not TimingLinkRefStructure) as type.

(I corrected some typo's in this post)

tefra commented 6 months ago

No I didn't say that 😄 xsdata can't go around flattening global types, that's what it does in both cases. We can't replace the child types with parent types just because, technically we could, but that would force the serializer to add the xsi:type to reveal the real type in order to pass schema validations.

The whole structure is maddening, all that for no good reason, the restriction basically does nothing, except creating a lot of clones.

    <xsd:element name="TimingLinkRef" type="TimingLinkRefStructure" substitutionGroup="LinkRef" />
    <xsd:complexType name="TimingLinkRefStructure">
        <xsd:simpleContent>
            <xsd:restriction base="LinkRefStructure">
                <xsd:attribute name="ref" type="TimingLinkIdType" use="required"/>
            </xsd:restriction>
        </xsd:simpleContent>
    </xsd:complexType>
    <xsd:complexType name="LinkRefStructure">
        <xsd:simpleContent>
            <xsd:restriction base="VersionOfObjectRefStructure">
                <xsd:attribute name="ref" type="LinkIdType" use="required">
                    <xsd:annotation>
                        <xsd:documentation>Identifier of a LINK.</xsd:documentation>
                    </xsd:annotation>
                </xsd:attribute>
            </xsd:restriction>
        </xsd:simpleContent>
    </xsd:complexType>
    <xsd:simpleType name="LinkIdType">
        <xsd:restriction base="ObjectIdType"/>
    </xsd:simpleType>
    <xsd:complexType name="VersionOfObjectRefStructure">
        <xsd:annotation>
            <xsd:documentation>Type for a versioned reference to a NeTEx Object.</xsd:documentation>
        </xsd:annotation>
        <xsd:simpleContent>
            <xsd:extension base="ObjectIdType">
                ...
                <xsd:attribute name="ref" type="ObjectIdType" use="required" />
skinkie commented 6 months ago

No I didn't say that 😄 xsdata can't go around flattening global types, that's what it does in both cases. We can't replace the child types with parent types just because, technically we could, but that would force the serializer to add the xsi:type to reveal the real type in order to pass schema validations.

But in this case TimingLinkRefStructure is the parent of TimingLinkRef? The problem is indeed that if the user wrongfully uses TimingLinkRefStructure (like xsdata suggests) it ends up with the xsi:type (so it is also to avoid user error).

The whole structure is maddening, all that for no good reason, the restriction basically does nothing, except creating a lot of clones.

I think the only reason the structure does this, is to create an unique IdType.

tefra commented 6 months ago

Can you show me an example, because I don't see that in the samples @skinkie,

TimingLinkRefStructure is the child not the parent

skinkie commented 6 months ago
from xsdata.formats.dataclass.serializers import XmlSerializer
from xsdata.formats.dataclass.serializers.config import SerializerConfig
from netex import StopPointInJourneyPattern, TimingLinkRefStructure

spijp = StopPointInJourneyPattern(onward_timing_link_ref=TimingLinkRefStructure(ref="test")) # This is based on type hinting in PyCharm.

ns_map={'': 'http://www.netex.org.uk/netex', 'gml': 'http://www.opengis.net/gml/3.2'}

serializer_config = SerializerConfig(ignore_default_attributes=True)
serializer_config.pretty_print = True
serializer_config.ignore_default_attributes = True
serializer = XmlSerializer(serializer_config)

print(serializer.render(spijp, ns_map))

It is not that bad in Python it seems :-)

<?xml version="1.0" encoding="UTF-8"?>
<StopPointInJourneyPattern xmlns="http://www.netex.org.uk/netex" xmlns:gml="http://www.opengis.net/gml/3.2">
  <OnwardTimingLinkRef ref="test"/>
</StopPointInJourneyPattern>

But considering that we want people to use TimingLinkRef (and not TimingLinkRefStructure) the question is: what to do?

skinkie commented 6 months ago

TimingLinkRefStructure is the child not the parent

I would say no?

@dataclass(kw_only=True)
class TimingLinkRef(TimingLinkRefStructure):
    class Meta:
        namespace = "http://www.netex.org.uk/netex"
tefra commented 6 months ago

I am sorry I was comparing this TimingLinkRefStructure(LinkRefStructure) , netex is not the simplest suite to follow

But considering that we want people to use TimingLinkRef (and not TimingLinkRefStructure) the question is: what to do?

I don't know 😄, update the schema? I can't help with that, why do you even want that, to generate the xsi:type?

skinkie commented 6 months ago

But considering that we want people to use TimingLinkRef (and not TimingLinkRefStructure) the question is: what to do?

I don't know 😄, update the schema? I can't help with that, why do you even want that, to generate the xsi:type?

I don't want to have the xsi:type. I would like to see the typehint TimingLinkRef. So someone would naively do this:

spijp = StopPointInJourneyPattern(onward_timing_link_ref=TimingLinkRef(ref="test"))
tefra commented 6 months ago

Ok so you want all the children of the TimingLinkRefStructure to appear in the type hint

TimingLinkRef
ServiceLinkRefStructure
ServiceLinkRef

Understandable, but very difficult to accomplice @skinkie as far as I know no-other schema code generator does this...

skinkie commented 6 months ago

Ok so you want all the children of the TimingLinkRefStructure to appear in the type hint

It is clear you look at least 3 steps ahead of my question.

Understandable, but very difficult to accomplice @skinkie as far as I know no-other schema code generator does this...

To be fair. xsData is the best schema generator ever built. I wonder why it took 20 years to appear ;-)

skinkie commented 6 months ago

I think for this specific question the OnwardTimingLink is not a 'substitution group kind of thing', and it should not do what I ask. I'll come back with a specific example iff there exists one.