nexusformat / definitions

Definitions of the NeXus Standard File Structure and Contents
https://manual.nexusformat.org/

XML namespace errors #1031

Open woutdenolf opened 2 years ago

woutdenolf commented 2 years ago

I'm using the XML language plugin by Red Hat in VS Code.

I'm getting these errors:

[image: the reported errors]

I'm not an XML expert but the file http://definition.nexusformat.org/nxdl/nxdlTypes.xsd is missing. Perhaps that causes some of the errors?

prjemian commented 2 years ago

It's here: https://github.com/nexusformat/definitions/blob/0592a9881d63d165739b6fa554b0d0ecc3ff4b7a/nxdl.xsd#L35-L42

How is this tool trying to find it? Perhaps assuming the targetNamespace URI is a valid URL? https://github.com/nexusformat/definitions/blob/0592a9881d63d165739b6fa554b0d0ecc3ff4b7a/nxdl.xsd#L4

A targetNamespace is not required to be a valid web address, or even to be formatted as one, although formatting it like a URL is customary. I'm guessing this tool assumes that the nxdlTypes.xsd file will be served from this web address. In our case, targetNamespace is a URI that is not a URL.

For reference about targetNamespace, this SO post has helpful explanations. W3Schools has reference documentation on targetNamespace (https://www.w3schools.com/xml/el_schema.asp) and the related xmlns.

prjemian commented 2 years ago

W3schools provides documentation for the XML <xs:include> element (where xs is the namespace of the XML Schema for XSD) and the xs symbol is declared here: https://github.com/nexusformat/definitions/blob/0592a9881d63d165739b6fa554b0d0ecc3ff4b7a/nxdl.xsd#L3

Note: The re-use of terms here is really mind numbing.

woutdenolf commented 2 years ago

The tool gets the namespace information from http://definition.nexusformat.org/nxdl/nxdl.xsd but the file nxdlTypes.xsd does not exist there...

prjemian commented 2 years ago

Note that since targetNamespace is not a URL (in our case), attempts to access the URL http://definition.nexusformat.org/nxdl/nxdlTypes.xsd give a 404 (not found) error, as expected. Nothing to fix here other than the assumption that a URI must always be a working URL.

prjemian commented 2 years ago

As discussed in #92 and answered in #835. Also referenced in #212.

woutdenolf commented 2 years ago

So this means I can only work on new nxdl.xml files inside the definitions repo and not anywhere else?

woutdenolf commented 2 years ago

The tool gets the namespace from

xsi:schemaLocation="http://definition.nexusformat.org/nxdl/3.1 ../nxdl.xsd"

which I had to change to this for it to work

xsi:schemaLocation="http://definition.nexusformat.org/nxdl/3.1 http://definition.nexusformat.org/nxdl/nxdl.xsd"

It finds it but gives lots of errors because "nxdlTypes.xsd" is not available in "http://definition.nexusformat.org/nxdl/" I presume.
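To illustrate why the tool ends up requesting nxdlTypes.xsd from the web server, here is a minimal Python sketch. It assumes the validator resolves a relative xs:include location against the URL of the schema that contains it, which is the usual behavior but has not been confirmed for this particular plugin. The helper names `parse_schema_location` and `resolve_include` are made up for this example.

```python
from urllib.parse import urljoin


def parse_schema_location(value):
    """Split an xsi:schemaLocation value into (namespace, location) pairs."""
    tokens = value.split()
    if len(tokens) % 2:
        raise ValueError("schemaLocation must hold namespace/location pairs")
    return list(zip(tokens[0::2], tokens[1::2]))


def resolve_include(schema_url, include_location):
    """Resolve an xs:include schemaLocation relative to the including schema."""
    return urljoin(schema_url, include_location)


hint = (
    "http://definition.nexusformat.org/nxdl/3.1 "
    "http://definition.nexusformat.org/nxdl/nxdl.xsd"
)

# The first token is only a name; the second token is where the tool looks.
namespace, schema_url = parse_schema_location(hint)[0]

# nxdl.xsd includes "nxdlTypes.xsd" with a relative location.
types_url = resolve_include(schema_url, "nxdlTypes.xsd")

print(namespace)
print(types_url)
```

With the relative hint `../nxdl.xsd`, the same resolution happens against the local file path, which is why validation works inside the repository, where nxdlTypes.xsd sits next to nxdl.xsd.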

woutdenolf commented 2 years ago

So this means I can only work on new nxdl.xml files inside the definitions repo and not anywhere else?

Isn't the idea of publishing an xml namespace that it can be used by validation tools everywhere?

prjemian commented 2 years ago

> So this means I can only work on new nxdl.xml files inside the definitions repo and not anywhere else?

If you want to use such automation tools that assist you in editing and checking validity of NXDL (.nxdl.xml) files, then the answer is Yes. Inside a branch or fork.

prjemian commented 2 years ago

> Isn't the idea of publishing an xml namespace that it can be used by validation tools everywhere?

Pick up the previous discussion: https://github.com/nexusformat/definitions/issues/835#issuecomment-712943011

Propose the publishing of this URI as an actual URL in a new issue.

woutdenolf commented 2 years ago

Having a URL for nxdl.xsd but not for nxdlTypes.xsd makes no sense imo. Either you have a URL for all .xsd files or for none.

So yes, I could propose that, but the .xsd files refer to each other with URIs:

https://github.com/nexusformat/definitions/blob/0592a9881d63d165739b6fa554b0d0ecc3ff4b7a/nxdl.xsd#L35

So I'm not sure this is even going to solve the problem.

woutdenolf commented 2 years ago

https://github.com/nexusformat/definitions/blob/0592a9881d63d165739b6fa554b0d0ecc3ff4b7a/nxdl.xsd#L3

> xs is the namespace of the XML Schema for XSD

From this and this description it seems that xs is the prefix of the "http://www.w3.org/2001/XMLSchema" namespace (a.k.a. the XML Schema, XML Schema Definition or XSD namespace). The name of an XML namespace is often a URL to avoid global collisions.

> The re-use of terms here is really mind numbing.

Yes I'm losing my mind over here. For example XML Schema is apparently one of the several XML schema languages. That's like having a car brand called Car. Do you have a car, a Car or both?
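As a small illustration of the prefix versus namespace distinction, the following Python sketch parses two tiny schema fragments with the standard library's ElementTree. The fragments are invented for this example and are not copied from nxdl.xsd. It suggests that the prefix (xs or anything else) is only a local alias, while the namespace name is the string that identifies the element.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Two made-up fragments: same namespace name, different prefixes.
doc_xs = (
    '<xs:schema xmlns:xs="%s">'
    '<xs:include schemaLocation="nxdlTypes.xsd"/>'
    "</xs:schema>" % XSD_NS
)
doc_foo = (
    '<foo:schema xmlns:foo="%s">'
    '<foo:include schemaLocation="nxdlTypes.xsd"/>'
    "</foo:schema>" % XSD_NS
)

root_xs = ET.fromstring(doc_xs)
root_foo = ET.fromstring(doc_foo)

# ElementTree expands every tag to "{namespace name}local name",
# so the prefix chosen in the document no longer matters.
print(root_xs.tag)
print(root_xs.tag == root_foo.tag)

# Elements are looked up by namespace name, not by prefix.
includes = root_foo.findall("{%s}include" % XSD_NS)
print([inc.get("schemaLocation") for inc in includes])
```

Nothing in this sketch fetches http://www.w3.org/2001/XMLSchema; the parser only compares the string.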

prjemian commented 2 years ago

Right: xs is the prefix we define for use in the nxdl.xsd file. The choice of xs follows a convention.

Clarify: The http://www.w3.org/2001/XMLSchema namespace is a URI (a formatted string) that just happens to also be a URL (a page available from a web server).

Right: XML Schema is one of several (perhaps ~6 in common use?). They have different strengths. I believe we experimented with schematron as an alternative to better express differences between rules for base classes and application definitions.

Right: If we publish our XML Schema as a URL, we could publish both files (nxdl.xsd and nxdlTypes.xsd).

woutdenolf commented 2 years ago

So for XML validation to work for NXDL files located anywhere, we would have to rely on schemaLocation being a URL, which means:

  1. In all XML files, replace this

xsi:schemaLocation="http://definition.nexusformat.org/nxdl/3.1 ../nxdl.xsd"

by this

xsi:schemaLocation="http://definition.nexusformat.org/nxdl/3.1 http://definition.nexusformat.org/nxdl/nxdl.xsd"

  2. In nxdl.xsd, replace this

<xs:include schemaLocation="nxdlTypes.xsd">

by this

<xs:include schemaLocation="http://definition.nexusformat.org/nxdl/nxdlTypes.xsd">

  3. Deploy nxdlTypes.xsd to http://definition.nexusformat.org/nxdl/nxdlTypes.xsd (separate issue #1032)

How does the content of http://definition.nexusformat.org/nxdl/ get built and deployed? Perhaps URI to URL substitution can be done in that process? We obviously don't want to do it in the .xsd and .xml files in this repository.
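A deploy-time substitution along these lines could look roughly like the Python sketch below. It is only an assumption about how such a step might work, not a description of the existing deployment. The function names, the `BASE_URL` constant, and the regular expressions are all hypothetical, and the sketch only handles an xsi:schemaLocation attribute with a single namespace/location pair.

```python
import re
from urllib.parse import urljoin

# Assumed deployment root; adjust to wherever the files are really served.
BASE_URL = "http://definition.nexusformat.org/nxdl/"

INCLUDE_RE = re.compile(r'(<xs:include\s+schemaLocation=")([^"]+)(")')
SCHEMA_LOCATION_RE = re.compile(r'(xsi:schemaLocation="\S+\s+)([^"\s]+)(")')


def absolutize_includes(xsd_text, base_url=BASE_URL):
    """Rewrite relative xs:include locations in a schema to absolute URLs."""

    def repl(match):
        absolute = urljoin(base_url, match.group(2))
        return match.group(1) + absolute + match.group(3)

    return INCLUDE_RE.sub(repl, xsd_text)


def absolutize_schema_location(nxdl_text, schema_url=BASE_URL + "nxdl.xsd"):
    """Point the schemaLocation hint of an NXDL file at the deployed schema."""
    return SCHEMA_LOCATION_RE.sub(
        lambda m: m.group(1) + schema_url + m.group(3), nxdl_text
    )


xsd_line = '<xs:include schemaLocation="nxdlTypes.xsd">'
nxdl_attr = (
    'xsi:schemaLocation="http://definition.nexusformat.org/nxdl/3.1 ../nxdl.xsd"'
)

print(absolutize_includes(xsd_line))
print(absolutize_schema_location(nxdl_attr))
```

Working on the text with regular expressions, rather than parsing and re-serializing the XML, keeps comments and formatting of the deployed copies identical to the repository files apart from the rewritten locations.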

woutdenolf commented 2 years ago

Additional remarks:

So there are some inconsistencies:

The http://definition.nexusformat.org/nxdl/ structure should be

http://definition.nexusformat.org/nxdl/{version}/nxdl.xsd # the Nexus definition language
http://definition.nexusformat.org/nxdl/{version}/nxdlTypes.xsd # the Nexus definition language
http://definition.nexusformat.org/nxdl/{version}/definitions/*.nxdl.xml # the Nexus Standard
http://definition.nexusformat.org/nxdl/{version}/manual # the documentation

Edit: the NXDL source links in the manual could then also point to the deployed sources

image

woutdenolf commented 2 years ago

@FreddieAkeroyd Could you comment on how to deploy to http://definition.nexusformat.org/nxdl/ from github actions (if possible)?

prjemian commented 2 years ago

No work on this issue at 2022-06 Code Camp. Is it necessary to resolve this for release of NXDL now?

woutdenolf commented 2 years ago

We can keep it for the next release