sissaschool / xmlschema

XML Schema validator and data conversion library for Python
MIT License

Incorrect Validation Error for Substitution Group with Abstract Head #417

Open StanimirIglev opened 3 weeks ago

StanimirIglev commented 3 weeks ago

Description:

The xmlschema library incorrectly invalidates XML documents that use elements from a substitution group whose head is abstract. According to the W3C XML Schema specifications, substitution group members can appear in the instance document even if the head element is abstract.

To reproduce:

  1. XML Schema (schema.xsd):
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="abstractSubgroupHead" type="xs:integer" abstract="true"/>
      <xs:element name="subgroupMember" type="xs:positiveInteger" substitutionGroup="abstractSubgroupHead"/>
    </xs:schema>
  2. XML document (document.xml):
    <subgroupMember>1</subgroupMember>
  3. Validate:
    xmlschema-validate --schema schema.xsd document.xml -v
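The same reproduction can be sketched through the Python API instead of the command line. This is a minimal sketch: the helper name `collect_reasons` and the inline string constants are made up for illustration, and it assumes the third-party `xmlschema` package is installed. No output is shown because the result depends on the installed release.

```python
import xml.etree.ElementTree as ET

# Same schema and document as in the steps above, inlined as strings.
SCHEMA_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="abstractSubgroupHead" type="xs:integer" abstract="true"/>
  <xs:element name="subgroupMember" type="xs:positiveInteger"
              substitutionGroup="abstractSubgroupHead"/>
</xs:schema>
"""
DOCUMENT_XML = "<subgroupMember>1</subgroupMember>"


def collect_reasons(schema_text, document_text):
    """Return the reason of every validation error (hypothetical helper)."""
    import xmlschema  # third-party; imported lazily so the module loads without it

    schema = xmlschema.XMLSchema(schema_text)
    root = ET.fromstring(document_text)
    return [error.reason for error in schema.iter_errors(root)]


if __name__ == "__main__":
    try:
        reasons = collect_reasons(SCHEMA_XSD, DOCUMENT_XML)
    except ImportError:
        print("xmlschema is not installed")
    else:
        print(reasons or "document is valid")
```

An empty list means the document validated; on an affected release the list would be expected to carry the reason quoted under "Actual behavior".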

Expected behavior:

The XML document should validate successfully: subgroupMember is a global, non-abstract element, so it is a valid root of the instance document.
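The rule behind this expectation can be illustrated with the standard library alone. The following is a toy model, not how xmlschema works internally: the helper names `allowed_roots` and `root_is_allowed` are hypothetical, and the sketch ignores namespaces, imports and type checking. It only lists the global elements that are not declared abstract and checks the document root against them.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="abstractSubgroupHead" type="xs:integer" abstract="true"/>
  <xs:element name="subgroupMember" type="xs:positiveInteger"
              substitutionGroup="abstractSubgroupHead"/>
</xs:schema>
"""


def allowed_roots(schema_text):
    """Names of global elements that are not abstract (toy model)."""
    schema = ET.fromstring(schema_text)
    return {
        element.get("name")
        for element in schema.findall(f"{XS}element")  # direct children only
        if element.get("abstract", "false") not in ("true", "1")
    }


def root_is_allowed(schema_text, document_text):
    """True if the document root is a non-abstract global element."""
    return ET.fromstring(document_text).tag in allowed_roots(schema_text)


print(root_is_allowed(SCHEMA_XSD, "<subgroupMember>1</subgroupMember>"))
print(root_is_allowed(SCHEMA_XSD, "<abstractSubgroupHead>1</abstractSubgroupHead>"))
```

Under this model the member element is accepted as a root while the abstract head itself is not.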

Actual behavior:

The xmlschema library reports the following error:

document.xml is not valid
failed validating <Element 'subgroupMember' at 0x7f6285136110> with XsdElement(name='abstractSubgroupHead', occurs=[1, 1]):

Reason: cannot use an abstract element for validation

Schema component:

  <xs:element xmlns:xs="http://www.w3.org/2001/XMLSchema" name="abstractSubgroupHead" type="xs:integer" abstract="true" />

Instance type: <class 'xml.etree.ElementTree.Element'>

Instance:

  <subgroupMember>1</subgroupMember>

Path: /subgroupMember

brunato commented 10 hours ago

Hi, I will insert a fix in the next release of the package.

The fix requires changing these lines:

https://github.com/sissaschool/xmlschema/blob/059fd3bfb305809d5ebdede1bd68dff9dbaf3348/xmlschema/validators/elements.py#L597-L599

by adding a branch that delegates validation/decoding to the matching substitute:

        if self.abstract:
            if self.name not in self.maps.substitution_groups:
                # An abstract element that heads no substitution group can never validate.
                reason = _("can't use an abstract element for validation "
                           "unless it's the head of a substitution group")
                yield self.validation_error(validation, reason, obj, **kwargs)
            else:
                # Delegate decoding to the substitute whose name matches the instance tag.
                for xsd_element in self.iter_substitutes():
                    if obj.tag == xsd_element.name:
                        yield from xsd_element.iter_decode(obj, validation, **kwargs)
                return
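The delegation idea in the snippet above can be tried in isolation with a small toy model. Everything here is an assumption made for illustration: `ToyElement`, its `substitutes` list and the positive-integer check are hypothetical stand-ins, not xmlschema classes. Unlike the posted snippet, the sketch also reports an error when no group member matches the tag, which is a choice of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ToyElement:
    """Hypothetical stand-in for an element declaration."""
    name: str
    abstract: bool = False
    substitutes: list = field(default_factory=list)

    def iter_errors(self, tag, text):
        """Yield error messages for an instance element given as (tag, text)."""
        if self.abstract:
            if not self.substitutes:
                yield ("can't use an abstract element for validation "
                       "unless it's the head of a substitution group")
                return
            # Delegate to the substitution group member whose name matches the tag.
            for member in self.substitutes:
                if tag == member.name:
                    yield from member.iter_errors(tag, text)
                    return
            # Assumption of this sketch: an unmatched tag is reported as an error.
            yield f"{tag!r} is not in the substitution group of {self.name!r}"
            return
        if tag != self.name:
            yield f"expected {self.name!r}, got {tag!r}"
            return
        # Toy type check standing in for xs:positiveInteger.
        value = text.strip()
        if not (value.isdigit() and int(value) > 0):
            yield f"{text!r} is not a positive integer"


head = ToyElement(
    "abstractSubgroupHead",
    abstract=True,
    substitutes=[ToyElement("subgroupMember")],
)
print(list(head.iter_errors("subgroupMember", "1")))
```

Validation that starts from the abstract head passes through to the concrete member, so the member's own checks decide the outcome.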

I will also extend the encoding part to handle abstract elements and substitution groups (tag matching is less straightforward there, because it depends on the convention used for the decoded data).

Thank you for the contribution.