Should XPath selecting down from root node / be allowed in assert test XPath expression?

martin-honnen commented 9 months ago

As far as I understand it, the XSD 1.1 specification restricts the allowed XPath expressions: you can only access the subtree of the element/type you are putting an assertion on.

However, it seems that XmlSchema doesn't implement such restrictions. For instance, in a schema like the one shown below, an XPath like count(/foods/food[@type='fruit']) eq /foods/recon/@fruits, which selects down from the root node /, is not rejected, as it seems to be by other XSD 1.1 validators (Xerces and Saxon EE).

Is it an intentional feature of XmlSchema that it doesn't impose restrictions on XPath expressions?

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning" elementFormDefault="qualified" attributeFormDefault="unqualified" vc:minVersion="1.1">
<xs:element name="food" type="foodType"/>
<xs:complexType name="foodType">
    <xs:sequence>
        <xs:element name="name" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="type">
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <xs:enumeration value="meat"/>
                <xs:enumeration value="vegetable"/>
                <xs:enumeration value="fruit"/>
            </xs:restriction>
        </xs:simpleType>
    </xs:attribute>
</xs:complexType>
<xs:element name="foods">
    <xs:annotation>
        <xs:documentation>Comment describing your root element</xs:documentation>
    </xs:annotation>
    <xs:complexType>
        <xs:sequence>
            <xs:element ref="food" maxOccurs="unbounded"/>
            <xs:element ref="recon"/>
        </xs:sequence>
        <xs:assert test="count(/foods/food[@type='fruit']) eq /foods/recon/@fruits"/>
    </xs:complexType>
</xs:element>
<xs:element name="recon" type="reconType"/>
<xs:complexType name="reconType">
    <xs:attribute name="fruits" type="xs:integer"/>
    <xs:attribute name="vegetables" type="xs:integer"/>
    <xs:attribute name="meats" type="xs:integer"/>
</xs:complexType>
</xs:schema>

brunato commented 9 months ago

Hi, according to the paragraph "G.1.4 Assertions and XPath", the use of XPath in assertions seems to have no restrictions on syntax.

More specifically, subsection 3.b.i) says:

i. When assertions on a complex type are evaluated, only the subtree rooted in an element of that type is mapped into the data model instance. References to ancestor elements or other nodes outside the subtree are not illegal but will not be effective.

So it is not invalid; it just doesn't select anything outside the scope.

martin-honnen commented 9 months ago

But it appears that XmlSchema does apply a rule such as <xs:assert test="count(/foods/food[@type='fruit']) eq /foods/recon/@fruits"/>, because a sample like

<foods>
<food type="meat">
    <name>Chicken</name>
</food>
<food type="meat">
    <name>Beef</name>
</food>
<food type="meat">
    <name>Pork</name>
</food>
<food type="fruit">
    <name>Banana</name>
</food>
<food type="fruit">
    <name>Apple</name>
</food>
<food type="vegetable">
    <name>Carrot</name>
</food>
<recon vegetables="1" fruits="2" meats="3"/>
</foods>

is assessed as valid while a sample like

<foods>
<food type="meat">
    <name>Chicken</name>
</food>
<food type="meat">
    <name>Beef</name>
</food>
<food type="meat">
    <name>Pork</name>
</food>
<food type="fruit">
    <name>Banana</name>
</food>
<food type="fruit">
    <name>Apple</name>
</food>
<food type="vegetable">
    <name>Carrot</name>
</food>
<recon vegetables="1" fruits="3" meats="3"/>
</foods>

is assessed as invalid.
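
For comparison, the reconciliation that the assertion expresses can be checked by hand with paths relative to the foods element, which stay inside the subtree of the element the assertion is attached to. This is only a minimal sketch using the standard library's xml.etree.ElementTree, not xmlschema; the helper name fruits_reconcile and the shortened sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET


def fruits_reconcile(xml_text: str) -> bool:
    """Hypothetical helper: a hand-written, relative equivalent of
    count(food[@type='fruit']) eq recon/@fruits, evaluated on <foods>."""
    foods = ET.fromstring(xml_text)
    # Count the fruit entries that are direct children of <foods>.
    fruit_count = len(foods.findall("food[@type='fruit']"))
    recon = foods.find("recon")
    fruits = recon.get("fruits") if recon is not None else None
    # A missing <recon> or @fruits is treated as a failed check.
    return fruits is not None and fruit_count == int(fruits)


# Shortened version of the samples above (assumption: same structure).
SAMPLE = """<foods>
<food type="meat"><name>Chicken</name></food>
<food type="fruit"><name>Banana</name></food>
<food type="fruit"><name>Apple</name></food>
<recon vegetables="0" fruits="{fruits}" meats="1"/>
</foods>"""

print(fruits_reconcile(SAMPLE.format(fruits=2)))  # True
print(fruits_reconcile(SAMPLE.format(fruits=3)))  # False
```

The relative form count(food[@type='fruit']) eq recon/@fruits does not depend on how a validator treats a leading /.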

brunato commented 9 months ago

Ok, so the subtree is a fragment, not a document, according to their definition in the XDM.

The concept of document is a bit confusing sometimes in ElementTree, e.g.:

>>> import lxml.etree as et
>>> root = et.XML('<root><elem1/><elem2/></root>')
>>> root.xpath('.')
[<Element root at 0x7f9c24dedb80>]
>>> root.xpath('/')
[]
>>> root.xpath('/root')
[<Element root at 0x7f9c24dedb80>]
>>> root.xpath('root')
[]

The same happens if you use an ElementTree instance:

>>> maybe_a_doc = et.ElementTree(root)
>>> maybe_a_doc.getroot()
<Element root at 0x7f9c24dedb80>
>>> maybe_a_doc.xpath('.')
[<Element root at 0x7f9c24dedb80>]
>>> maybe_a_doc.xpath('/')
[]
>>> maybe_a_doc.xpath('/root')
[<Element root at 0x7f9c24dedb80>]
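
For contrast, the standard library's xml.etree.ElementTree (not lxml) handles the same case by rejecting it: an absolute path on an Element raises an error instead of returning an empty list. A minimal sketch, using only the standard library:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<root><elem1/><elem2/></root>')

# Relative paths are evaluated against the element itself.
print(root.findall('.'))      # a list containing the root element
print(root.findall('elem1'))  # a list containing the elem1 child

# An absolute path on an Element is rejected with SyntaxError.
rejected = None
try:
    root.findall('/root')
except SyntaxError as exc:
    rejected = str(exc)
print(rejected)
```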

Anyway, I could change the behavior to reject absolute expressions or to evaluate them as relative (it might be better to reject, as Xerces and Saxon EE do, to be explicit and avoid confusion).

brunato commented 9 months ago

Xerces reports the root '/' as incorrect, but as a warning, not an error. I have to try Saxon HE on this (note added: I cannot test it, because SaxonC-EE is needed for XSD validation).

Also, the two tests "d4_3_15ii31" and "d4_3_15ii32" of the W3C XML Schema 1.1 test suite report the schema as valid.

The annotation of these tests says:

<ts:annotation>
    <ts:documentation>"//" returns empty sequence</ts:documentation>
</ts:annotation>

So my preferred option is to generate a warning when the schema instance is parsed. The difference is that the XML data will be set as a fragment (using elementpath>=4.2.1), so '/' and '//' will select nothing.
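
To illustrate the idea of warning at parse time, here is a rough sketch of a check on the assertion test string. It is not how xmlschema or elementpath detect rooted paths; the function warn_if_rooted and its regular expressions are hypothetical and deliberately naive (for example, a rooted path that only appears after an operator such as eq is missed). A real implementation would inspect the parsed XPath expression instead.

```python
import re
import warnings

# Hypothetical, naive check: blank out string literals, then look for a '/'
# that starts a path, i.e. one at the beginning of the expression or right
# after '(', ',', '[' or '|'.
_LITERAL = re.compile(r"'[^']*'|\"[^\"]*\"")
_ROOTED = re.compile(r"(?:^|[(,\[|])\s*/")


def warn_if_rooted(test: str) -> bool:
    """Emit a UserWarning if the assert test seems to use a rooted path."""
    rooted = bool(_ROOTED.search(_LITERAL.sub("''", test)))
    if rooted:
        warnings.warn(f"rooted path in assertion test {test!r} selects nothing")
    return rooted


absolute = "count(/foods/food[@type='fruit']) eq /foods/recon/@fruits"
relative = "count(food[@type='fruit']) eq recon/@fruits"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    print(warn_if_rooted(absolute))  # True
    print(warn_if_rooted(relative))  # False
print(len(caught))  # 1
```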

martin-honnen commented 9 months ago

@brunato, thanks. Yes, a warning and the change to ensure / and // don't select anything seem fine.

brunato commented 8 months ago

@martin-honnen, a resolution with a warning message and an empty selection for rooted '/' and '//' is available in release v3.0.2. Thanks.

martin-honnen commented 8 months ago

@brunato, thanks.

sissaschool / xmlschema, issue #386