xmlet / XsdParser

A Parser that parses a XSD file into a Java Structure.
MIT License

Parent relationship strange (or broken?) #24

Closed hein4daddel closed 4 years ago

hein4daddel commented 4 years ago

Hello, thank you for this helpful library. While parsing a (rather complex) schema that follows the Venetian blind pattern, I observed a problem with the parent/children relationship of complex types. First, an example that works correctly:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" atributeFormDefault="qualified">
    <xsd:complexType name="CT1">
        <xsd:sequence>
            <xsd:element name="Nr" type="xsd:string" minOccurs="0" maxOccurs="1"/>
        </xsd:sequence>
    </xsd:complexType>
    <xsd:complexType name="CT2">
        <xsd:sequence>
            <xsd:element name="Nr" type="xsd:string" minOccurs="0" maxOccurs="1"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>

This produces the following structure (the test code is at the end of this comment):

----------------------------------------------------------
ComplexType CT1#1209358542
 -> children=XsdElement#479442206 {minOccurs=0, name=Nr, maxOccurs=1, type=xsd:string}
 -> parent=XsdSchema#2081368312 {attributeFormDefault=qualified, xmlns:xsd=http://www.w3.org/2001/XMLSchema, elementFormDefault=qualified, xmlns=nsallg, targetNamespace=nsallg, xmlns:allg=nsallg}
 -> -> children of parent=XsdComplexType#1209358542 {name=CT1}
 -> -> children of parent=XsdComplexType#1182469998 {name=CT2}
 -> grandparent=null
----------------------------------------------------------
ComplexType CT2#1182469998
 -> children=XsdElement#1195473402 {minOccurs=0, name=Nr, maxOccurs=1, type=xsd:string}
 -> parent=XsdSchema#2081368312 {attributeFormDefault=qualified, xmlns:xsd=http://www.w3.org/2001/XMLSchema, elementFormDefault=qualified, xmlns=nsallg, targetNamespace=nsallg, xmlns:allg=nsallg}
 -> -> children of parent=XsdComplexType#1209358542 {name=CT1}
 -> -> children of parent=XsdComplexType#1182469998 {name=CT2}
 -> grandparent=null
----------------------------------------------------------

This is what I expect: a Schema -> ComplexType -> Element structure.

Now the problem: in the following XSD, a complex type is used as the type of an element.

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" atributeFormDefault="qualified">
    <xsd:complexType name="CT1">
        <xsd:sequence>
            <xsd:element name="Nr" type="xsd:string" minOccurs="0" maxOccurs="1"/>
            <xsd:element name="CT2Element" type="CT2" minOccurs="0" maxOccurs="unbounded" />
        </xsd:sequence>
    </xsd:complexType>
    <xsd:complexType name="CT2">
        <xsd:sequence>
            <xsd:element name="Nr" type="xsd:string" minOccurs="0" maxOccurs="1"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>

Here the structure is strange: CT1 is correct, but CT2 looks as if it were declared in a Russian doll structure rather than a Venetian blind one. Its parent is the element that references it, not the schema.

----------------------------------------------------------
ComplexType CT1#1541298091
 -> children=XsdElement#2075981552 {minOccurs=0, name=Nr, maxOccurs=1, type=xsd:string}
 -> children=XsdElement#729867689 {minOccurs=0, name=CT2Element, maxOccurs=unbounded, type=CT2}
 -> parent=XsdSchema#1287658623 {attributeFormDefault=qualified, xmlns:xsd=http://www.w3.org/2001/XMLSchema, elementFormDefault=qualified, xmlns=nsallg, targetNamespace=nsallg, xmlns:allg=nsallg}
 -> -> children of parent=XsdComplexType#1541298091 {name=CT1}
 -> -> children of parent=XsdComplexType#919063521 {name=CT2}
 -> grandparent=null
----------------------------------------------------------
ComplexType CT2#919063521
 -> children=XsdElement#449240381 {minOccurs=0, name=Nr, maxOccurs=1, type=xsd:string}
 -> parent=XsdElement#729867689 {minOccurs=0, name=CT2Element, maxOccurs=unbounded, type=CT2}
 -> grandparent=XsdSequence#1445534206 {}
----------------------------------------------------------

It gets messier when a complex type is used recursively, as the type of one of its own elements:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" atributeFormDefault="qualified">
    <xsd:complexType name="CT1">
        <xsd:sequence>
            <xsd:element name="Nr" type="xsd:string" minOccurs="0" maxOccurs="1"/>
            <xsd:element name="CT1Element" type="CT1" minOccurs="0" maxOccurs="unbounded" />
        </xsd:sequence>
    </xsd:complexType>
    <xsd:complexType name="CT2">
        <xsd:sequence>
            <xsd:element name="Nr" type="xsd:string" minOccurs="0" maxOccurs="1"/>
        </xsd:sequence>
    </xsd:complexType>
</xsd:schema>

the result is:

----------------------------------------------------------
ComplexType CT1#628402659
 -> children=XsdElement#1071550332 {minOccurs=0, name=Nr, maxOccurs=1, type=xsd:string}
 -> children=XsdElement#299395997 {minOccurs=0, name=CT1Element, maxOccurs=unbounded, type=CT1}
 -> parent=XsdElement#299395997 {minOccurs=0, name=CT1Element, maxOccurs=unbounded, type=CT1}
 -> grandparent=XsdSequence#288887829 {}
----------------------------------------------------------
ComplexType CT2#1524026401
 -> children=XsdElement#134271077 {minOccurs=0, name=Nr, maxOccurs=1, type=xsd:string}
 -> parent=XsdSchema#1762731246 {attributeFormDefault=qualified, xmlns:xsd=http://www.w3.org/2001/XMLSchema, elementFormDefault=qualified, xmlns=nsallg, targetNamespace=nsallg, xmlns:allg=nsallg}
 -> -> children of parent=XsdComplexType#628402659 {name=CT1}
 -> -> children of parent=XsdComplexType#1524026401 {name=CT2}
 -> grandparent=null

You can see a circular reference in CT1 between its children and its parent: the element CT1Element is both a child and the parent of CT1. I don't think this is intended.

Here's the source code for the tests:

    // Imports needed by the two methods below.
    import java.util.List;
    import java.util.stream.Collectors;
    import org.xmlet.xsdparser.core.XsdParser;
    import org.xmlet.xsdparser.xsdelements.XsdAbstractElement;
    import org.xmlet.xsdparser.xsdelements.XsdComplexType;
    import org.xmlet.xsdparser.xsdelements.XsdSchema;

    // Prints each top-level complex type with its children, its parent,
    // the parent's children and its grandparent.
    public void parseSchema(String schemafile) {
        XsdParser parser = new XsdParser(schemafile);
        List<XsdSchema> schemas = parser.getResultXsdSchemas().collect(Collectors.toList());
        for (XsdSchema schema : schemas) {
            List<XsdComplexType> cts = schema.getChildrenComplexTypes().collect(Collectors.toList());
            for (XsdComplexType ct : cts) {
                XsdAbstractElement parent = ct.getParent();
                System.out.println("----------------------------------------------------------");
                System.out.println("ComplexType " + ct.getName() + "#" + System.identityHashCode(ct));
                ct.getXsdElements().forEach(e -> System.out.println(" -> children=" + getInfo(e)));
                System.out.println(" -> parent=" + getInfo(parent));
                // Guard against a missing parent instead of failing with a NullPointerException.
                if (parent != null) {
                    parent.getXsdElements()
                            .forEach(e -> System.out.println(" -> -> children of parent=" + getInfo(e)));
                }
                System.out.println(" -> grandparent=" + (parent != null ? getInfo(parent.getParent()) : "null"));
            }
        }
    }

    // Describes an element as SimpleClassName#identityHash {attributes}, or "null".
    private String getInfo(XsdAbstractElement xae) {
        if (xae == null) {
            return "null";
        }
        return xae.getClass().getSimpleName() + "#" + System.identityHashCode(xae) + " " + xae.getAttributesMap();
    }

lcduarte commented 4 years ago

Hello!

Thanks for using the library and for the detailed issue. I was wrongly assigning the XsdElement as the parent of the XsdComplexType when the type attribute was being resolved. After fixing the issue I got the following outputs:

First example (the same result as the original):

ComplexType CT1#306115458
 -> children=XsdElement#71399214 {minOccurs=0, name=Nr, maxOccurs=1, type=xsd:string}
 -> parent=XsdSchema#1932831450 {xmlns:xsd=http://www.w3.org/2001/XMLSchema, elementFormDefault=qualified, atributeFormDefault=qualified}
 -> -> Parent Children =XsdComplexType#306115458 {name=CT1}
 -> -> Parent Children =XsdComplexType#1302227152 {name=CT2}
 -> grandparent=null
----------------------------------------------------------
ComplexType CT2#1302227152
 -> children=XsdElement#1122606666 {minOccurs=0, name=Nr, maxOccurs=1, type=xsd:string}
 -> parent=XsdSchema#1932831450 {xmlns:xsd=http://www.w3.org/2001/XMLSchema, elementFormDefault=qualified, atributeFormDefault=qualified}
 -> -> Parent Children =XsdComplexType#306115458 {name=CT1}
 -> -> Parent Children =XsdComplexType#1302227152 {name=CT2}
 -> grandparent=null

Second Example:

Original Result:

ComplexType CT1#306115458
 -> children=XsdElement#71399214 {minOccurs=0, name=Nr1, maxOccurs=1, type=xsd:string}
 -> children=XsdElement#1932831450 {minOccurs=0, name=CT2Element, maxOccurs=unbounded, type=CT2}
 -> parent=XsdSchema#496729294 {xmlns:xsd=http://www.w3.org/2001/XMLSchema, elementFormDefault=qualified}
 -> -> Parent Children =XsdComplexType#306115458 {name=CT1}
 -> -> Parent Children =XsdComplexType#1122606666 {name=CT2}
 -> grandparent=null
----------------------------------------------------------
ComplexType CT2#1122606666
 -> children=XsdElement#350068407 {minOccurs=0, name=Nr2, maxOccurs=1, type=xsd:string}
 -> parent=XsdElement#1932831450 {minOccurs=0, name=CT2Element, maxOccurs=unbounded, type=CT2}
 -> grandparent=XsdSequence#1390869998 {}

After the fix:

ComplexType CT1#1629687658
 -> children=XsdElement#1192171522 {minOccurs=0, name=Nr1, maxOccurs=1, type=xsd:string}
 -> children=XsdElement#1661081225 {minOccurs=0, name=CT2Element, maxOccurs=unbounded, type=CT2}
 -> parent=XsdSchema#1882554559 {xmlns:xsd=http://www.w3.org/2001/XMLSchema, elementFormDefault=qualified}
 -> -> Parent Children =XsdComplexType#1629687658 {name=CT1}
 -> -> Parent Children =XsdComplexType#23211803 {name=CT2}
 -> grandparent=null
----------------------------------------------------------
ComplexType CT2#23211803
 -> children=XsdElement#1923598304 {minOccurs=0, name=Nr2, maxOccurs=1, type=xsd:string}
 -> parent=XsdSchema#1882554559 {xmlns:xsd=http://www.w3.org/2001/XMLSchema, elementFormDefault=qualified}
 -> -> Parent Children =XsdComplexType#1629687658 {name=CT1}
 -> -> Parent Children =XsdComplexType#23211803 {name=CT2}
 -> grandparent=null

The parent of CT2 is now the XsdSchema instead of the XsdElement whose type attribute is CT2. Therefore the children of CT2's parent are CT1 and CT2.

Third Example (Circular Reference):

Original:

ComplexType CT1#1629687658
 -> children=XsdElement#1192171522 {minOccurs=0, name=Nr1, maxOccurs=1, type=xsd:string}
 -> children=XsdElement#1661081225 {minOccurs=0, name=CT1Element, maxOccurs=unbounded, type=CT1}
 -> parent=XsdElement#1661081225 {minOccurs=0, name=CT1Element, maxOccurs=unbounded, type=CT1}
 -> grandparent=XsdSequence#1049817027 {}
----------------------------------------------------------
ComplexType CT2#23211803
 -> children=XsdElement#1923598304 {minOccurs=0, name=Nr2, maxOccurs=1, type=xsd:string}
 -> parent=XsdSchema#776700275 {xmlns:xsd=http://www.w3.org/2001/XMLSchema, elementFormDefault=qualified}
 -> -> Parent Children =XsdComplexType#1629687658 {name=CT1}
 -> -> Parent Children =XsdComplexType#23211803 {name=CT2}
 -> grandparent=null

After the fix:

ComplexType CT1#1629687658
 -> children=XsdElement#1192171522 {minOccurs=0, name=Nr1, maxOccurs=1, type=xsd:string}
 -> children=XsdElement#1661081225 {minOccurs=0, name=CT1Element, maxOccurs=unbounded, type=CT1}
 -> parent=XsdSchema#1882554559 {xmlns:xsd=http://www.w3.org/2001/XMLSchema, elementFormDefault=qualified, atributeFormDefault=qualified}
 -> -> Parent Children =XsdComplexType#1629687658 {name=CT1}
 -> -> Parent Children =XsdComplexType#23211803 {name=CT2}
 -> grandparent=null
----------------------------------------------------------
ComplexType CT2#23211803
 -> children=XsdElement#1923598304 {minOccurs=0, name=Nr2, maxOccurs=1, type=xsd:string}
 -> parent=XsdSchema#1882554559 {xmlns:xsd=http://www.w3.org/2001/XMLSchema, elementFormDefault=qualified, atributeFormDefault=qualified}
 -> -> Parent Children =XsdComplexType#1629687658 {name=CT1}
 -> -> Parent Children =XsdComplexType#23211803 {name=CT2}
 -> grandparent=null

After the fix, CT1 has the XsdSchema as its parent instead of the XsdElement, which was what produced the circular reference.
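To make the difference concrete, here is a minimal sketch of the two behaviours using a toy tree. It is not XsdParser's actual code: `Node`, `resolveBuggy` and `resolveFixed` are hypothetical names invented for illustration, and the only assumption is that resolving a type attribute links the element to the shared type definition.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy node; it only mimics the parent/children links discussed above.
class Node {
    final String name;
    Node parent;
    final List<Node> children = new ArrayList<>();
    Node resolvedType; // set on element nodes once their type attribute is resolved

    Node(String name) { this.name = name; }

    // Attaches a child and records this node as its declaring parent.
    Node addChild(Node child) {
        child.parent = this;
        children.add(child);
        return child;
    }
}

class TypeResolutionDemo {
    // Buggy variant: resolving the reference also re-parents the shared type.
    static void resolveBuggy(Node element, Node type) {
        element.resolvedType = type;
        type.parent = element; // overwrites the declaring parent (the schema)
    }

    // Fixed variant: only the link is stored; the type keeps its declaring parent.
    static void resolveFixed(Node element, Node type) {
        element.resolvedType = type;
    }

    // Builds schema -> CT1 -> sequence -> CT1Element and returns {schema, CT1, CT1Element}.
    static Node[] build() {
        Node schema = new Node("schema");
        Node ct1 = schema.addChild(new Node("CT1"));
        Node sequence = ct1.addChild(new Node("sequence"));
        Node element = sequence.addChild(new Node("CT1Element"));
        return new Node[] { schema, ct1, element };
    }

    public static void main(String[] args) {
        Node[] buggy = build();
        resolveBuggy(buggy[2], buggy[1]);
        System.out.println("buggy parent of CT1: " + buggy[1].parent.name); // CT1Element

        Node[] fixed = build();
        resolveFixed(fixed[2], fixed[1]);
        System.out.println("fixed parent of CT1: " + fixed[1].parent.name); // schema
    }
}
```

In the buggy variant CT1's parent chain runs CT1 -> CT1Element -> sequence -> CT1, which is the loop shown in the third example.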

Does this fix solve what you observed?

hein4daddel commented 4 years ago

Hello, yes, that's what I expected, thanks for the fast response! By the way, I ran into this problem because I need to know which schema a complex type belongs to, so I called getParent recursively and hoped to reach the schema node. Maybe this would be a nice feature for XsdAbstractElement: a getSchema() method that returns the schema an element belongs to.
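The parent walk described above can be sketched roughly as follows. This uses a hypothetical `TreeNode` stand-in rather than the library's classes, and `findSchema` is an illustrative name, not an XsdParser method. The identity set is there so the walk terminates even if the parent links form a loop, as they did before the fix.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for a parsed XSD node; not an XsdParser class.
class TreeNode {
    final String kind;
    TreeNode parent;

    TreeNode(String kind, TreeNode parent) {
        this.kind = kind;
        this.parent = parent;
    }
}

class SchemaLookup {
    // Walks up the parent chain until a "schema" node is found.
    // Returns empty if the chain ends or revisits a node (circular parents).
    static Optional<TreeNode> findSchema(TreeNode start) {
        Set<TreeNode> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        TreeNode current = start;
        while (current != null && seen.add(current)) {
            if ("schema".equals(current.kind)) {
                return Optional.of(current);
            }
            current = current.parent;
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        TreeNode schema = new TreeNode("schema", null);
        TreeNode complexType = new TreeNode("complexType", schema);
        TreeNode sequence = new TreeNode("sequence", complexType);
        TreeNode element = new TreeNode("element", sequence);
        System.out.println(findSchema(element).isPresent()); // true

        // Simulate the reported bug: the type's parent points back at its own element.
        complexType.parent = element;
        System.out.println(findSchema(element).isPresent()); // false, the loop is detected
    }
}
```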

lcduarte commented 4 years ago

I've published a new release (1.0.29) with the fix.

I also added the getSchema method to XsdAbstractElement.

If you can, please give me some feedback on this release. I've changed PCs and I hope I didn't break anything with the new deploy configuration. Also, if you want, leave a star on the repository.

Thanks for your input.