Closed hlapp closed 8 years ago
Right, but these errors come straight from the nexml.org validator, and I get the same errors when I just upload the file to the browser, so this is not an RNeXML issue.
Incidentally, I'm getting validation errors from the nexml.org validator when I try with no namespaces on the elements as well (just as we discussed), even though I would have thought the default namespace would be inferred. @rvosa did I misunderstand something here? Possibly this is a bug on the validator?
These errors are not mysterious at all: XML identifiers (that is, id attributes that can be referenced elsewhere in a document) have to be "non-colonized names" (NCName). URIs are not suited for this: they contain colons, as well as other characters that (AFAIK) are also not allowed under the production rules for NCNames (namely, the forward slashes).
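To make the NCName point concrete, here is a minimal sketch of the check. It is an ASCII-only approximation of the NCName production (the real rule also admits many non-ASCII letters), and the helper names are made up for illustration; this is not what the nexml.org validator runs.

```python
import re

# ASCII-only approximation of an XML NCName: a letter or underscore,
# followed by letters, digits, underscores, dots or hyphens. No colons.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_ascii_ncname(value):
    """True if value matches the simplified NCName pattern above."""
    return NCNAME_ASCII.fullmatch(value) is not None


def offending_characters(value):
    """Sorted list of characters that can never appear in an ASCII NCName."""
    return sorted({ch for ch in value if not re.fullmatch(r"[A-Za-z0-9_.\-]", ch)})


uri = "http://purl.obolibrary.org/obo/VTO_0036225"
print(is_ascii_ncname(uri))           # False
print(offending_characters(uri))      # ['/', ':']
print(is_ascii_ncname("VTO_0036225")) # True
```

So the local part of the identifier would be fine as an id; it is the URI form, with its colon and slashes, that gets rejected.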
On Wed, Sep 30, 2015 at 11:37 PM, Hilmar Lapp notifications@github.com wrote:
Here's the log:
nexml_validate("./inst/examples/test_original.xml")
[1] FALSE
Warning message:
In nexml_validate("./inst/examples/test_original.xml") :
  Validation failed, error messages: 'http://purl.obolibrary.org/obo/VTO_0036225' is not a valid xml NCName for Bio::Phylo::Taxa::Taxon=SCALAR(0x1ce5440)
  Validation failed, error messages: 'http://purl.obolibrary.org/obo/VTO_0036225' is not a valid xml NCName for Bio::Phylo::Taxa::Taxon=SCALAR(0x1ce5440)
The file https://github.com/xu-hong/rphenoscape/blob/master/inst/examples/test_original.xml is the original NeXML returned by the Phenoscape API.
One thing that might come into play here is the use of HTTP URIs as local identifiers. I have filed issue phenoscape/phenoscape-kb-services#15 https://github.com/phenoscape/phenoscape-kb-services/issues/15 to ask whether this is on purpose, and what motivates it.
Incidentally, I'm getting validation errors from the nexml.org validator when I try with no namespaces on the elements as well (just as we discussed), even though I would have thought the default namespace would be inferred. @rvosa https://github.com/rvosa did I misunderstand something here? Possibly this is a bug on the validator?
It is impossible to say without seeing the input file and the log. Apart from the integrity checks about whether the right blocks are referring to each other (which you can't really express in XML Schema) the validation is for the most part a totally generic XML Schema validation that involved essentially zero coding on my end, so the scope for bugs there is probably limited.
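For a rough idea of what an integrity check of that kind looks like, here is a sketch that looks for references pointing at ids that do not exist in the document. It assumes that `otus` and `otu` attributes are the references of interest and that `id` attributes are the targets; the function name and the sample document are hypothetical, and this is not the validator's actual code.

```python
import xml.etree.ElementTree as ET

# Assumption for this sketch: these attributes hold references to ids
# declared elsewhere in the same document.
REFERENCE_ATTRIBUTES = ("otus", "otu")


def dangling_references(xml_text):
    """Return (element, attribute, target) for references to unknown ids."""
    root = ET.fromstring(xml_text)
    known_ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    problems = []
    for el in root.iter():
        for name in REFERENCE_ATTRIBUTES:
            target = el.get(name)
            if target is not None and target not in known_ids:
                local_name = el.tag.rpartition("}")[2]  # drop "{namespace}"
                problems.append((local_name, name, target))
    return problems


SAMPLE = """<nexml xmlns="http://www.nexml.org/2009">
  <otus id="tax1"><otu id="t1"/></otus>
  <trees otus="tax1">
    <tree id="tree1">
      <node id="n1" otu="t1"/>
      <node id="n2" otu="t9"/>
    </tree>
  </trees>
</nexml>"""

print(dangling_references(SAMPLE))  # [('node', 'otu', 't9')]
```

A generic XML Schema validation would not flag the second node here, which is why this sort of check has to be coded separately.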
The XML isn't valid. See @balhoff's comments on phenoscape/phenoscape-kb-services#15
@rvosa @hlapp @balhoff Thanks, that makes perfect sense in the case of the phenoscape example.
I'm still a little confused by the validation with respect to having namespace prefixes like nex: on the values of attributes (particularly the value of xsi:type attributes). For instance, this NeXML file https://github.com/ropensci/RNeXML/blob/master/inst/examples/meta_example.xml is valid by the online validator, and it uses bare xsi:type values in meta elements, e.g. it uses:
<meta xsi:type="LiteralMeta"
instead of
<meta xsi:type="nex:LiteralMeta"
However, when I remove the nex: prefixes from the character xsi:type values in this other valid NeXML file https://github.com/ropensci/RNeXML/blob/master/inst/examples/characters.xml, it stops being valid. Why? Why isn't the top level namespace inferred automatically?
If I understood from recent discussion, we felt that it was best to ignore these prefixes until we could expand them properly, and when generating XML to omit them for compatibility. Maybe I got that wrong.
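As background for the question above: an xsi:type value is a QName, and under XML Schema's rules an unprefixed QName takes the default namespace that is in scope at that element, if there is one. The sketch below resolves xsi:type values that way. The two sample documents are made up for illustration and are not the files from the repository, so this shows the mechanism only and does not diagnose why characters.xml fails.

```python
import io
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def resolve_xsi_types(xml_text):
    """Return (namespace URI or None, local name) for each xsi:type value."""
    in_scope = []   # stack of (prefix, uri); prefix "" is the default namespace
    resolved = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start-ns", "start", "end-ns")
    for event, payload in ET.iterparse(source, events=events):
        if event == "start-ns":
            in_scope.append(payload)
        elif event == "end-ns":
            in_scope.pop()
        else:
            value = payload.get(XSI_TYPE)
            if value is None:
                continue
            prefix, _, local = value.rpartition(":")
            uri = None
            # Innermost declaration of the prefix wins.
            for declared_prefix, declared_uri in reversed(in_scope):
                if declared_prefix == prefix:
                    uri = declared_uri or None
                    break
            resolved.append((uri, local))
    return resolved


# Hypothetical document that declares a default namespace.
WITH_DEFAULT = """<nexml xmlns="http://www.nexml.org/2009"
    xmlns:nex="http://www.nexml.org/2009"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <meta xsi:type="LiteralMeta"/>
  <meta xsi:type="nex:LiteralMeta"/>
</nexml>"""

# Hypothetical document that only declares the nex: prefix.
WITHOUT_DEFAULT = """<nex:nexml xmlns:nex="http://www.nexml.org/2009"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <nex:meta xsi:type="LiteralMeta"/>
</nex:nexml>"""

print(resolve_xsi_types(WITH_DEFAULT))
print(resolve_xsi_types(WITHOUT_DEFAULT))
```

In the first document both spellings resolve to the same type name; in the second, the bare value ends up in no namespace at all, which is one way removing a prefix could turn a valid file into an invalid one.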
On Thu, Oct 1, 2015 at 4:42 PM, Carl Boettiger notifications@github.com wrote:
@rvosa https://github.com/rvosa @hlapp https://github.com/hlapp @balhoff https://github.com/balhoff Thanks, that makes perfect sense in the case of the phenoscape example.
I'm still a little confused by the validation with respect to having namespace prefixes like nex: on the values of attributes (particularly the value of xsi:type attributes). For instance, this NeXML file https://github.com/ropensci/RNeXML/blob/master/inst/examples/meta_example.xml is valid by the online validator, and it uses bare xsi:type values in meta elements, e.g. it uses:
<meta xsi:type="LiteralMeta"
instead of
<meta xsi:type="nex:LiteralMeta"
However, when I remove the nex: prefixes from the character xsi:type values in this other valid NeXML file https://github.com/ropensci/RNeXML/blob/master/inst/examples/characters.xml, it stops being valid. Why? Why isn't the top level namespace inferred automatically?
I wonder what would happen if you removed the xsi:schemaLocation attribute from the file that fails if the xsi:type is not fully qualified. In your former case (the file that succeeds whether or not there is a prefix) we don't actually say anywhere explicitly where the schema is located - though the validator knows, on the basis of the namespace URI. In the latter, we do give a schema location. Perhaps the validator tries (and fails) to do something with that in the case of the default namespace?
@hlapp XML coming out of OntoTrace should validate completely now (since 2015-10-5). Please let me know if you encounter any problems.
Should be fixed by PR #133