sbgn / libsbgn

Libraries for the Systems Biology Graphical Notation (SBGN); Java and C++

Proposal to allow double colon (i.e. "::") to be used inside a glyph id #47

Closed ibalaur closed 5 years ago

ibalaur commented 5 years ago

Currently, a validation rule rejects the double colon inside a glyph id; for example: cvc-datatype-valid.1.2.1: 'n3::n0' is not a valid value for 'NCName'. However, Newt reads and accepts such ids. As an enhancement suggestion, could the current library be extended to accept this format?

Input example:

glyph id="n5::n1" class="macromolecule multimer"

glyph id="n5::n1::0" class="unit of information"

cannin commented 5 years ago

Here is my understanding of the error. This is an XML error, not an SBGN error. An SBGNML file must first be a valid XML file, and the error you identified comes from checks we did not write. Conceptually, I do not mind the change you suggest, but I am not supportive of turning off this base level of schema checking completely, and I do not know how to configure these checks. The restriction likely exists to avoid clashes with XML namespace prefixes, which use the colon as a separator (a valid concern).

I assume what is happening is that the schema forces identifiers to be of type xsd:ID:

https://github.com/sbgn/libsbgn/blob/be7b0ab902bbf8fc5cffbc792da638c46e22ca88/resources/SBGN.xsd#L350

which in turn must be NCName valid:

http://books.xmlschemata.org/relaxng/ch19-77151.html

http://books.xmlschemata.org/relaxng/ch19-77215.html
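To make the NCName restriction concrete, here is a minimal sketch of a pre-validation check an application could run on its ids. It is not part of libSBGN: the function name `is_ascii_ncname` is hypothetical, and the pattern is a deliberately simplified, ASCII-only approximation of the NCName production (the real one also allows many non-ASCII letters). The key point it illustrates is that a colon is never allowed.

```python
import re

# Simplified, ASCII-only approximation of an XML NCName (assumption: the
# full production also permits many Unicode characters, omitted here).
# First character: letter or underscore. Later characters: letters,
# digits, underscore, hyphen, or period. A colon is never allowed.
_ASCII_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_ascii_ncname(value: str) -> bool:
    """Return True if value matches the simplified NCName pattern."""
    return _ASCII_NCNAME.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("n3::n0", "n3__n0", "0abc"):
        print(candidate, is_ascii_ncname(candidate))
```

An id such as `n3::n0` fails this check because of the colons, which matches the schema error reported above.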

@ugurdogrusoz I'm not sure about what might be going on with Newt validation. I do not know a way forward, and maybe it should remain as is. If you see other options, let us know.

ugurdogrusoz commented 5 years ago

Newt uses libSBGN.js, so this is an issue of the associated library, not of Newt. I am not sure how @royludo handled these identifiers and managed to accept double colons.

fbergmann commented 5 years ago

For now, libSBGN will let you read/write documents containing invalid identifiers, but I think the validation is correct to report them as an issue.

ibalaur commented 5 years ago

Thank you all for the replies. @ugurdogrusoz, I believe @royludo replaced such characters with, e.g., double dashes or underscores, as I did to get valid SBGN files after conversion from, e.g., yEd (which allows double colons inside names). However, I understand now that this format can cause issues, so it should not be considered further in libSBGN; applications should handle it themselves when it appears.
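For applications that take the replacement route, a minimal sketch of the idea might look like the following. The function name `build_id_mapping` and the `__` replacement are assumptions chosen for illustration, not something libSBGN or libSBGN.js provides. The sketch only removes colons; it does not fix other NCName problems such as an id starting with a digit. It returns an old-to-new mapping so that anything referring to the old ids (for example, arc source and target attributes) can be rewritten consistently.

```python
def build_id_mapping(ids, replacement="__"):
    """Map each id to a colon-free id, avoiding collisions.

    Hypothetical helper: "::" becomes `replacement`, and any remaining
    single ":" becomes "_".
    """
    ids = list(ids)
    # Reserve ids that are already colon-free so that rewritten ids
    # cannot silently collide with them.
    used = {i for i in ids if ":" not in i}
    mapping = {}
    for old in ids:
        if old in mapping:
            continue  # duplicate input id, already handled
        if ":" not in old:
            mapping[old] = old
            continue
        candidate = old.replace("::", replacement).replace(":", "_")
        new, n = candidate, 1
        # Append a numeric suffix until the new id is unique.
        while new in used:
            new = f"{candidate}_{n}"
            n += 1
        used.add(new)
        mapping[old] = new
    return mapping


if __name__ == "__main__":
    for old, new in build_id_mapping(["n5::n1", "n5::n1::0"]).items():
        print(old, "->", new)
```

Keeping the mapping explicit, rather than rewriting ids in place, makes it possible to update references in the same pass and to report the renames back to the user.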