Open sthibaul opened 8 years ago
Of course github mangled the xml code... Here are the files attached
(I had to append .txt extensions for github to be happy...)
ID checking is defined as part of the Relax NG DTD compatibility specification. Because it is inherited from DTDs, the ID type of an attribute must be consistent for any element matched by a Relax NG pattern. The problem usually appears when the same content can also be matched by a wildcard-like pattern (any element with any content, any attribute, etc.): the wildcard-like pattern gives the attribute no ID type, while the more concrete pattern that defines the element and the attribute gives it the ID type. If you look at the indicated line in docbook.rng, you will see a wildcard-like pattern there that matches xml:id with no ID type, while your schema defines it as an ID, hence the error. You can turn off ID checking in Jing (look for the available options), or you can change the schema to avoid the problem. One possibility is to change the any pattern to exclude xml:id and match it explicitly as an ID, something like this:
<define name="db._any.attribute">
<choice>
<attribute>
<a:documentation>Any attribute including in any attribute in any
namespace.</a:documentation>
<anyName>
<except>
<name>xml:id</name>
</except>
</anyName>
</attribute>
<attribute name="xml:id">
<data type="ID"/>
</attribute>
</choice>
</define>
Regards, George
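To make the rule concrete, here is a minimal Python sketch of the check that triggers the error. It is an illustration only, not Jing's implementation: name classes are modelled as predicates, overlap is tested against a short list of probe names rather than computed exactly, and the enclosing element definitions are assumed to compete. AttributePattern, conflicts and the helper functions are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Tuple

XML_NS = "http://www.w3.org/XML/1998/namespace"
Name = Tuple[str, str]  # (namespace URI, local name)


@dataclass(frozen=True)
class AttributePattern:
    matches: Callable[[Name], bool]  # simplified stand-in for a name class
    id_type: Optional[str]           # "ID", "IDREF", "IDREFS", or None for text


def any_name(name: Name) -> bool:
    """Models <anyName/>."""
    return True


def any_name_except_xml_id(name: Name) -> bool:
    """Models <anyName><except><name>xml:id</name></except></anyName>."""
    return name != (XML_NS, "id")


def only_xml_id(name: Name) -> bool:
    """Models name="xml:id"."""
    return name == (XML_NS, "id")


def conflicts(a: AttributePattern, b: AttributePattern,
              probe_names: Iterable[Name]) -> bool:
    """True if some probe name belongs to both name classes
    while the two patterns carry different ID-types."""
    overlap = any(a.matches(n) and b.matches(n) for n in probe_names)
    return overlap and a.id_type != b.id_type


probes = [(XML_NS, "id"), ("", "linkend")]
root_id = AttributePattern(only_xml_id, "ID")               # user schema
wildcard = AttributePattern(any_name, None)                 # db._any.attribute as shipped
fixed_wildcard = AttributePattern(any_name_except_xml_id, None)
fixed_id = AttributePattern(only_xml_id, "ID")              # second branch of the choice

print(conflicts(root_id, wildcard, probes))        # shipped schema
print(conflicts(root_id, fixed_wildcard, probes))  # wildcard with xml:id excluded
print(conflicts(root_id, fixed_id, probes))        # explicit xml:id typed as ID
```

In this model only the unmodified wildcard overlaps with the ID-typed xml:id while disagreeing on the ID-type, which is the situation the suggested schema change is meant to remove.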
But db._any isn't used anywhere in the XML file. There may be an error in the DocBook 5 schema (for instance, if an XML file uses xml:id somewhere in MathML contents as a descendant of a DocBook element, the user wouldn't get what he may expect), but here xml:id doesn't appear as a descendant of a DocBook element, so that db._any isn't used and there shouldn't be any error.
The thing is that a "root" element may appear in that area of an instance document, and in that case the processor would not know which ID type to assign to its xml:id attribute. This is a static error found by analysing the schema, not a runtime error on a specific instance document. I mentioned the options above; one is to turn off ID checking.
Best Regards, George
If a "root" element appears in db._any, then the xml:id attribute would have type text in this context, because this is what the grammar says. Consider the following XML file:
<?xml version="1.0" encoding="utf-8"?>
<root xmlns="http://localhost/" xml:id="foo">
<para xmlns="http://docbook.org/ns/docbook" linkend="foo">
<inlineequation>
<foo xmlns="http://www.w3.org/1998/Math/MathML">
<root xmlns="http://localhost/" xml:id="bar"/>
</foo>
</inlineequation>
</para>
</root>
Here, xml:id="foo" would be of type ID because one has start = element root { attribute xml:id { xsd:ID }, db.para }. However, the other "root" element is part of db._any, with db._any = element * - (db:* | html:*) { (db._any.attribute | text | db._any)* } and db._any.attribute = attribute * { text }, so that the type of xml:id="bar" would be text.
That said, the "xml:" namespace is special, as it is standard. https://www.w3.org/XML/1998/namespace says: "The xml:id specification defines a single attribute, xml:id, known to be of type ID independently of any DTD or schema." Note the "independently". So, because of this, xml:id="bar" is of type ID. This is how libxml2 behaves (I've checked, replacing linkend="foo" by linkend="bar"). That's probably why the DocBook 5 schema doesn't exclude xml:* in db._any.
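The two readings can be compared with a small Python sketch using only the standard library. It does not run a RELAX NG validator; it simply hard-codes the two interpretations discussed here (ID type taken from the grammar only, versus xml:id always being an ID) and applies them to the example document. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared xml: prefix to this qualified name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

DOC = """\
<root xmlns="http://localhost/" xml:id="foo">
  <para xmlns="http://docbook.org/ns/docbook" linkend="foo">
    <inlineequation>
      <foo xmlns="http://www.w3.org/1998/Math/MathML">
        <root xmlns="http://localhost/" xml:id="bar"/>
      </foo>
    </inlineequation>
  </para>
</root>
"""


def ids_per_grammar(doc_root):
    """Grammar-only reading: just the document element's xml:id is xsd:ID;
    anything matched through db._any has type text."""
    value = doc_root.get(XML_ID)
    return {value} if value is not None else set()


def ids_per_xml_id_spec(doc_root):
    """xml:id reading: every xml:id attribute is an ID, wherever it appears."""
    return {el.get(XML_ID) for el in doc_root.iter() if el.get(XML_ID) is not None}


doc_root = ET.fromstring(DOC)
for linkend in ("foo", "bar"):
    print(linkend,
          "grammar-only:", linkend in ids_per_grammar(doc_root),
          "xml:id spec:", linkend in ids_per_xml_id_spec(doc_root))
```

Under the grammar-only reading, linkend="bar" has no target; under the xml:id reading it does, which matches the libxml2 behaviour described above.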
And note that ID checking is useful, I don't want to turn it off. Currently, jing cannot work with any serious schema that mixes DocBook 5 and another namespace.
I agree that this is one of the major pain points with Relax NG, but I do not know the best way forward...
Ideally, wildcard name classes like anyName or nsName should not contribute to identifying the ID type, and ID type assignment should be done using only the information from elements/attributes specified with specific names. A similar issue appears in DITA 1.3, which uses Relax NG as its normative schema; there we need to exclude some element names to get the schemas working.
Maybe the best solution would be an update to the DTD compatibility spec http://relaxng.org/compatibility-20011203.html#id to say that the anyName and nsName name classes should not be considered when we check whether two element/attribute to ID type mappings compete, and then we can follow up by updating Jing accordingly.
Maybe @jclark can share some insight on this.
Regards, George
If this can be useful in tests, here are two standalone examples.
<?xml version="1.0" encoding="utf-8"?>
<ex1>
<foo xml:id="a">
<bar ref="a b">
<foo xml:id="b"/>
</bar>
</foo>
</ex1>
The corresponding schema:
start =
element ex1 {
element foo {
attribute xml:id { xsd:ID }?,
element bar {
attribute ref { xsd:IDREFS }?,
element foo {
attribute xml:id { text }?
}
}
}
}
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
<start>
<element name="ex1">
<element name="foo">
<optional>
<attribute name="xml:id">
<data type="ID"/>
</attribute>
</optional>
<element name="bar">
<optional>
<attribute name="ref">
<data type="IDREFS"/>
</attribute>
</optional>
<element name="foo">
<optional>
<attribute name="xml:id"/>
</optional>
</element>
</element>
</element>
</element>
</start>
</grammar>
For this first example, according to the "xml:" namespace specifications, xml:id is always of type ID (what's inside attribute xml:id { } should be ignored). For this reason, I don't think there is a DTD compatibility issue concerning this example (this will be different in the second example, which I assume is less common).
<?xml version="1.0" encoding="utf-8"?>
<ex2>
<foo myid="a">
<bar ref="a">
<foo myid="b"/>
</bar>
</foo>
</ex2>
The corresponding schema:
start =
element ex2 {
element foo {
attribute myid { xsd:ID }?,
element bar {
attribute ref { xsd:IDREFS }?,
element foo {
attribute myid { text }?
}
}
}
}
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
<start>
<element name="ex2">
<element name="foo">
<optional>
<attribute name="myid">
<data type="ID"/>
</attribute>
</optional>
<element name="bar">
<optional>
<attribute name="ref">
<data type="IDREFS"/>
</attribute>
</optional>
<element name="foo">
<optional>
<attribute name="myid"/>
</optional>
</element>
</element>
</element>
</element>
</start>
</grammar>
Here, I've replaced the standard xml:id by myid. So, the first myid instance myid="a" is of type ID, but not the second myid instance myid="b" (the validation should fail if ref="a" is replaced by ref="b"). Again, libxml2 behaves that way.
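The expected outcome for this second example can be sketched in Python with the standard library. This is not a RELAX NG validator: the typing from the ex2 schema is hard-coded (only the myid on the foo directly under ex2 is xsd:ID; the one on the inner foo is text), and dangling_refs is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

EX2 = """\
<ex2>
  <foo myid="a">
    <bar ref="a">
      <foo myid="b"/>
    </bar>
  </foo>
</ex2>
"""


def dangling_refs(xml_text):
    """Return the IDREFS tokens that do not match any ID-typed myid value."""
    ex2 = ET.fromstring(xml_text)
    # Per the schema, only ex2/foo/@myid is an ID; ex2/foo/bar/foo/@myid is text.
    ids = {foo.get("myid") for foo in ex2.findall("foo")
           if foo.get("myid") is not None}
    # ref is xsd:IDREFS, i.e. a whitespace-separated list of references.
    refs = [token
            for bar in ex2.findall("foo/bar")
            for token in bar.get("ref", "").split()]
    return [token for token in refs if token not in ids]


print(dangling_refs(EX2))                                # ref="a"
print(dangling_refs(EX2.replace('ref="a"', 'ref="b"')))  # ref="b"
```

With ref="a" nothing is left dangling; with ref="b" the reference has no ID-typed target, so validation would be expected to fail.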
Please note that this functionality is part of the DTD compatibility specification, and that means you cannot have two different declarations for the same attribute in the same element, because you cannot have that in a DTD.
I suggest two possibilities:
Is this an outright bug that we ideally should fix in the sources? Or is it more of an enhancement request?
Well, it looks like a bug: the tool says that the tdb.xml file is invalid although it is valid.
The conflicting ID-type error is not reported on the XML document; it is reported on the schema, and it comes from the DTD compatibility spec [1]. DTD compatibility ID checking is controlled by an option [2], so you can disable it. The check does what the DTD compatibility spec says, so it is not a problem in Jing. If the spec is updated, Jing can follow the updated spec.
[1] https://www.oasis-open.org/committees/relax-ng/compatibility-20011203.html#id
if its attribute parent has any competing attribute elements, then each such competing attribute element has a data or value child specifying a datatype associated with the same ID-type. Two attribute elements <attribute> nc1 p1 </attribute> and <attribute> nc2 p2 </attribute> compete if and only if the containing definitions compete and there is a name n that belongs to both nc1 and nc2. Note that a definition competes with itself.
[2] http://www.thaiopensource.com/relaxng/jing.html
-i Disables checking of ID/IDREF/IDREFS. By default, Jing enforces the constraints imposed by RELAX NG DTD Compatibility with respect to ID/IDREF/IDREFS.
The -i option is not OK, since I still want ID checking. Compare with xmllint --relaxng, for instance.
Hello,
As reported by Vincent Lefevre in Debian bug report http://bugs.debian.org/834555 :
“ jing yields an error on a valid XML file (neither xmllint, nor Emacs nXML complain).
Consider the following files:
==> tdb.xml <== <?xml version="1.0" encoding="utf-8"?>
==> tdb.rnc <== default namespace = "http://localhost/"
include "/usr/share/xml/docbook/schema/rng/5.0/docbook.rnc" { start |= notAllowed }
root = element root { attribute xml:id { xsd:ID }, db.para }
start = root
==> tdb.rng <==
<?xml version="1.0" encoding="UTF-8"?>
<grammar ns="http://localhost/" xmlns="http://relaxng.org/ns/structure/1.0" datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
Note: I generated tdb.rng with "trang tdb.rnc tdb.rng" and updated the path to docbook.rng to reuse the schemas from the docbook5-xml package.
I get the following error:
zira:~> jing tdb.rng tdb.xml
[warning] /usr/bin/jing: No java runtime was found
/usr/share/xml/docbook/schema/rng/5.0/docbook.rng:83:16: error: conflicting ID-types for attribute "id" from namespace "http://www.w3.org/XML/1998/namespace" of element "root" from namespace "http://localhost/"
while with xmllint from libxml2-utils:
zira:~> xmllint --noout --relaxng tdb.rng tdb.xml
tdb.xml validates
and when I open tdb.xml in Emacs, it is said:
-UUU:----F1 tdb.xml All L1 (nXML Valid) -------------- Using schema ~/tdb.rnc ”