getodk / web-forms

ODK Web Forms enables form filling and submission editing of ODK forms in a web browser. It's coming soon! ✨
https://getodk.org
Apache License 2.0

`id` function is not compliant with the spec #176

Open sadiqkhoja opened 3 months ago

sadiqkhoja commented 3 months ago

As per the spec, the `id` function selects elements by their unique ID, and the attribute that holds the unique ID is declared in the DTD.

Q. If no DTD is defined, is the `id` attribute implicitly considered the unique ID of the element node?

The following test fails, because we are not handling DTD at this moment, right?

it.only('adhoc tests', () => {
    const testDocument = xml`<?xml version="1.0" encoding="UTF-8"?>
            <!DOCTYPE root [
                    <!ELEMENT root (foo+)>
                    <!ELEMENT foo (bar)>
                    <!ELEMENT bar (#PCDATA)>
                    <!ATTLIST foo name ID #REQUIRED>
            ]>
            <root>
                    <foo name="first">
                            <bar>Some content</bar>
                    </foo>
                    <foo name="second">
                            <bar>Another content</bar>
                    </foo>
            </root>
`;
    const evaluator = new Evaluator({rootNode: testDocument});

    const actual = evaluator.evaluate(`count(id('first second'))`);

    expect(actual.numberValue).toBe(2);
})

This issue probably just needs documentation.

eyelidlessness commented 3 months ago

I've inferred that "the spec" above refers to XPath 1.0 (id, 5.2.1 Unique IDs).

Pertinent spec language:

> The [id](https://www.w3.org/TR/1999/REC-xpath-19991116/#function-id) function selects elements by their unique ID (see [[5.2.1 Unique IDs]](https://www.w3.org/TR/1999/REC-xpath-19991116/#unique-id)). [...]
>
> [...]
>
> #### 5.2.1 Unique IDs
>
> An element node may have a unique identifier (ID). This is the value of the attribute that is declared in the DTD as type `ID`. No two elements in a document may have the same unique ID. If an XML processor reports two elements in a document as having the same unique ID (which is possible only if the document is invalid) then the second element in document order must be treated as not having a unique ID.
>
> > **NOTE:** If a document does not have a DTD, then no element in the document will have a unique ID.
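To make the quoted rules concrete, here is a minimal sketch of spec-style `id()` lookup. It is not the web-forms implementation: the `El` model, the `IdDeclarations` map (standing in for `<!ATTLIST ... ID ...>` declarations already extracted from a DTD), and both function names are hypothetical, and DTD parsing itself is assumed away.

```typescript
// Hypothetical minimal element model; not the web-forms or DOM API.
interface El {
  name: string;
  attrs: Record<string, string>;
  children: El[];
}

// Maps element name -> attribute declared as type ID in the DTD.
// `null` models a document with no DTD.
type IdDeclarations = Map<string, string> | null;

// Walk the tree in document order and index elements by unique ID.
function indexUniqueIds(root: El, decls: IdDeclarations): Map<string, El> {
  const index = new Map<string, El>();
  if (decls === null) return index; // spec NOTE: no DTD, so no unique IDs

  const visit = (el: El): void => {
    const idAttr = decls.get(el.name);
    const value = idAttr === undefined ? undefined : el.attrs[idAttr];
    // A repeated ID is skipped: the later element is treated as having no unique ID.
    if (value !== undefined && !index.has(value)) index.set(value, el);
    el.children.forEach(visit);
  };
  visit(root);
  return index;
}

// id('a b') splits its string argument on whitespace and selects each matching element.
function selectById(root: El, decls: IdDeclarations, arg: string): El[] {
  const index = indexUniqueIds(root, decls);
  const found = new Set<El>();
  for (const token of arg.split(/\s+/).filter((t) => t !== '')) {
    const el = index.get(token);
    if (el !== undefined) found.add(el);
  }
  return Array.from(found);
}

// The document from the issue, with `name` declared as the ID attribute of `foo`.
const doc: El = {
  name: 'root',
  attrs: {},
  children: [
    { name: 'foo', attrs: { name: 'first' }, children: [] },
    { name: 'foo', attrs: { name: 'second' }, children: [] },
  ],
};
const dtd: IdDeclarations = new Map<string, string>([['foo', 'name']]);

const withDtd = selectById(doc, dtd, 'first second').length; // 2
const withoutDtd = selectById(doc, null, 'first second').length; // 0
```

Under these assumptions, the test in the issue would only pass for an evaluator that reads the `ATTLIST` declaration; with no declarations available, the spec's own NOTE gives an empty result.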

A few notes:

> Q. If no DTD is defined, is the `id` attribute implicitly considered the unique ID of the element node?

This is the current behavior, and as mentioned it is consistent with all of the major browsers. I believe the behavior is expected for HTML documents, and any non-HTML consideration is just unaddressed (I'd expect it never will be).

> we are not handling DTD at this moment, right?

Correct.
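Taken together, the two answers explain the failing test. The sketch below models the behavior as described in this thread (DTD ignored, an attribute literally named `id` treated as the unique ID). It is an illustration under that assumption, not the evaluator's actual code; the `Elem` model and `selectByIdAttribute` are hypothetical names.

```typescript
// Hypothetical minimal element model; not the web-forms or DOM API.
interface Elem {
  name: string;
  attrs: Record<string, string>;
  children: Elem[];
}

// Assumed fallback: DTD declarations are ignored, and only an attribute
// named `id` is consulted when matching the whitespace-separated tokens.
function selectByIdAttribute(root: Elem, arg: string): Elem[] {
  const wanted = new Set(arg.split(/\s+/).filter((t) => t !== ''));
  const matches: Elem[] = [];
  const visit = (el: Elem): void => {
    const value = el.attrs['id'];
    if (value !== undefined && wanted.has(value)) matches.push(el);
    el.children.forEach(visit);
  };
  visit(root);
  return matches;
}

// The issue's document: IDs live in `name`, as its DTD declares.
const issueDoc: Elem = {
  name: 'root',
  attrs: {},
  children: [
    { name: 'foo', attrs: { name: 'first' }, children: [] },
    { name: 'foo', attrs: { name: 'second' }, children: [] },
  ],
};

// The same shape, but carrying `id` attributes instead.
const idDoc: Elem = {
  name: 'root',
  attrs: {},
  children: [
    { name: 'foo', attrs: { id: 'first' }, children: [] },
    { name: 'foo', attrs: { id: 'second' }, children: [] },
  ],
};

const countWithName = selectByIdAttribute(issueDoc, 'first second').length; // 0
const countWithId = selectByIdAttribute(idDoc, 'first second').length; // 2
```

In this model the issue's `count(id('first second'))` comes out as 0 rather than 2, because nothing ever learns that `name` was declared as type `ID`.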

I was curious if there is any precedent for DTD support or interest in it. Not that it's an exhaustive search, but this issue is the first to mention DTD across all of the getodk org.