qiuwei / jing-trang

Automatically exported from code.google.com/p/jing-trang

Discussion about allowed patterns for ID data types (and for NCNames in general) #188

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Let's say I have an XML document with an ID-type attribute whose value is the 
character U+03DB (hex code 03DB, GREEK SMALL LETTER STIGMA):

 idAttr="ϛ"

which is validated with Jing using the pattern:

 <data type="ID"/>

The validation reports the attribute value as invalid.

The specification of the ID type for RELAX NG schemas:

http://relaxng.org/compatibility-20011203.html#id

defines it to be the same as the ID type defined in the XML Schema datatypes:

http://www.w3.org/TR/xmlschema-2/#ID

which binds it directly to the ID attribute type from an XML 1.0 Second 
Edition working draft.
That working draft binds the ID attribute type to the Name production:

www.w3.org/TR/2000/WD-xml-2e-20000814#NT-Name

This particular character has the hex code #x03DB, which indeed does not fit 
anywhere in the relevant part of the BaseChar production:

..... [#x03D0-#x03D6] | #x03DA | #x03DC | #x03DE | #x03E0.......

However, the final XML 1.0 specification allows this character both as a name 
start character and as a name character:

http://www.w3.org/TR/REC-xml/#NT-NameStartChar
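
To make the mismatch concrete, here is a small Java sketch that checks a code 
point against both rules. It is only an illustration: the class and method 
names are made up and are not part of Jing, the first check covers only the 
BaseChar fragment quoted above (not the whole production), and the second 
table is my transcription of the NameStartChar production from the linked 
Recommendation.

```java
// Illustrative sketch only; not Jing's code. All names here are hypothetical.
final class NameStartCheckSketch {

    // Covers ONLY the BaseChar fragment quoted in this report:
    // [#x03D0-#x03D6] | #x03DA | #x03DC | #x03DE | #x03E0
    static boolean inQuotedBaseCharFragment(int cp) {
        return (cp >= 0x03D0 && cp <= 0x03D6)
            || cp == 0x03DA || cp == 0x03DC || cp == 0x03DE || cp == 0x03E0;
    }

    // NameStartChar ranges as transcribed from the linked XML 1.0 Recommendation.
    private static final int[][] NAME_START_RANGES = {
        {':', ':'}, {'A', 'Z'}, {'_', '_'}, {'a', 'z'},
        {0xC0, 0xD6}, {0xD8, 0xF6}, {0xF8, 0x2FF},
        {0x370, 0x37D}, {0x37F, 0x1FFF}, {0x200C, 0x200D},
        {0x2070, 0x218F}, {0x2C00, 0x2FEF}, {0x3001, 0xD7FF},
        {0xF900, 0xFDCF}, {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF}
    };

    // True if the code point falls inside any of the ranges above.
    static boolean isNameStartChar(int cp) {
        for (int[] range : NAME_START_RANGES) {
            if (cp >= range[0] && cp <= range[1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int cp = "\u03DB".codePointAt(0); // the character from idAttr
        System.out.printf("U+%04X in quoted BaseChar fragment: %b%n",
                cp, inQuotedBaseCharFragment(cp));
        System.out.printf("U+%04X matches NameStartChar: %b%n",
                cp, isNameStartChar(cp));
    }
}
```

For U+03DB the first check is false and the second is true, which is exactly 
the discrepancy described here.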

So I'm not sure what can be done about this. Could we relax the validation in 
the "com.thaiopensource.xml.util.Naming" class to be compatible with the 
final XML 1.0 specification? Or would that mean violating the RELAX NG 
standard, since it points to the XML Schema 1.0 datatypes standard, which in 
turn points to this XML 1.0 working draft?

Original issue reported on code.google.com by raducor...@gmail.com on 22 Oct 2014 at 7:24