Closed rclark closed 10 years ago
So I refer to these just as "namespaces", but I suppose that they are URIs for the schemas. We have no choice but to keep those that are currently indicated for the 30+ schemas, as none of the 400+ services would validate if those were changed now.
I agree that the namespace and prefix that we were using are too specific to a segment of the project's data. It would be nice if schemas.usgin.org could manage the namespace/prefix creation for us, but that would really mean that all the namespaces in the current services would have to be changed, hence versioned and redeployed, so that's not practical.
I know what you mean, but having multiple identifiers is nothing if not super confusing.
From that perspective you could say we have one URI (generated by schemas.usgin.org) and one Namespace (just something we put in the XSD). That's valid, but it's confusing.
Oh, and have them different? I understand, that would work. It'd be really confusing.
That's what's happening right now.
We probably need to discuss f2f, but I think of it this way. The Namespace is an abstract concept for the names defined in the content model. I think the Namespace could be viewed as a representation of the content model, thus have the same URI. The XML schema implements the content model. Elements in the XML schema are scoped to a namespace. The namespace URI in an XML instance can be considered to identify the content model. The xsi:schemaLocation gives a URL that locates an XML schema that implements the content model using that namespace.
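To make the distinction between the namespace URI and the schema location concrete, here is a minimal sketch that reads both out of an XML instance. The instance document, its element name, and the example.org .xsd URL are made up for illustration; only the namespace URI comes from this thread.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical instance document: the element name and the .xsd URL are assumptions.
INSTANCE = """<aasg:ActiveFault
    xmlns:aasg="http://stategeothermaldata.org/uri-gin/aasg/xmlschema/activefault/1.1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://stategeothermaldata.org/uri-gin/aasg/xmlschema/activefault/1.1 http://example.org/schemas/ActiveFault.xsd"/>"""


def namespace_and_locations(xml_text):
    """Return (namespace URI of the root element, {namespace URI: schema URL})."""
    root = ET.fromstring(xml_text)
    # ElementTree writes qualified tags as "{namespace-uri}localname".
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else None
    # xsi:schemaLocation is a whitespace-separated list of (namespace, URL) pairs.
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    locations = dict(zip(tokens[0::2], tokens[1::2]))
    return namespace, locations


namespace, locations = namespace_and_locations(INSTANCE)
print("identifies the content model:", namespace)
print("locates an implementing XSD: ", locations.get(namespace))
```

The namespace URI is only a name and does not have to resolve; the second value in each schemaLocation pair is the one that has to be a working URL.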
steve
After a conversation with @smrAzGS, I think we decided that when new versions of the content models are created, we should build XSD docs that use the schemas.usgin.org URIs as the namespace URI.
Still up in the air whether or not we should provide redirection rules that would "resolve" the existing stategeothermaldata.org namespace URIs.
@smrAzGS @ccaudill @jalisdairi @asonnenschein
We have unresolved content model URI discrepancies that we need to resolve. Here's the story:
We started out by defining what URIs should look like. Importantly, we thought that URIs should be "host-agnostic", meaning that the important part of the identifier was what comes after the http://my-server-name.something/. This would allow more than one server to exist on the internet that could be capable of "resolving" URIs.
We set up http://resources.usgin.org to resolve the URIs that we were making up. We were pretty happy. We made a bunch of content models, made XSDs for them, and gave them URIs like this one:
http://stategeothermaldata.org/uri-gin/aasg/xmlschema/activefault/1.1
(wait a minute... that won't even resolve anywhere!)
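As a rough illustration of the "host-agnostic" idea, the sketch below treats two URIs as naming the same thing when everything after the host matches. The function names are hypothetical, and the second URI in the usage example is constructed for illustration, not one that was actually minted.

```python
from urllib.parse import urlsplit


def host_agnostic_key(uri):
    """Return the part of the URI after the host, which is what was meant to matter."""
    return urlsplit(uri).path.rstrip("/")


def same_resource(uri_a, uri_b):
    """Compare two URIs while ignoring scheme and host."""
    return host_agnostic_key(uri_a) == host_agnostic_key(uri_b)


a = "http://stategeothermaldata.org/uri-gin/aasg/xmlschema/activefault/1.1"
b = "http://resources.usgin.org/uri-gin/aasg/xmlschema/activefault/1.1"
print(host_agnostic_key(a))
print(same_resource(a, b))
```

A comparison like this only works inside software that already holds both strings. A plain HTTP client will still send its request to whatever host the URI names.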
Then we realized that we needed a dedicated system for managing our content models as we spun out new versions. We also wanted a place that people could come to and understand what models we had to offer, so they could find what's appropriate for their data. So, http://schemas.usgin.org/ happened.
One of the things that site was intended to do was automatically generate and maintain redirection rules for any schemas that you set up in the system there. So that site includes its own URI redirection engine. When you create a new model, it makes the URI redirection rules for you.
Now remember that host-agnostic part of things? Well... practically speaking that's a bust.
schemas.usgin.org can only resolve URIs that start with schemas.usgin.org, and resources.usgin.org can only resolve URIs that start with resources.usgin.org. That's because if you try to resolve a URI with another host name, your request never even gets to the server. That's how the internet works.
So, schemas.usgin.org has to make URIs that start with schemas.usgin.org. Also, in conversations with Steve, we determined that /aasg/ wasn't appropriate and /ngds/ made more sense, and that /xmlschema/ was too specific and /dataschema/ made more sense.
So, http://schemas.usgin.org makes URIs like this:
http://schemas.usgin.org/uri-gin/ngds/dataschema/activefault/1.1
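The relationship between the two URI forms can be sketched as a simple rewrite. This is only an illustration of the mapping described above (host swap, /aasg/ to /ngds/, /xmlschema/ to /dataschema/), not the redirection engine that schemas.usgin.org actually runs; the function and constant names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

LEGACY_HOST = "stategeothermaldata.org"
NEW_HOST = "schemas.usgin.org"
# Path segments that were renamed, per the conversation with Steve.
SEGMENT_MAP = {"aasg": "ngds", "xmlschema": "dataschema"}


def to_schemas_uri(legacy_uri):
    """Rewrite a legacy namespace URI into the schemas.usgin.org form."""
    parts = urlsplit(legacy_uri)
    if parts.netloc != LEGACY_HOST:
        raise ValueError("not a legacy URI: %s" % legacy_uri)
    # Note: this renames matching segments wherever they occur in the path,
    # which is fine for the URI pattern shown in this thread.
    segments = [SEGMENT_MAP.get(s, s) for s in parts.path.split("/")]
    return urlunsplit((parts.scheme, NEW_HOST, "/".join(segments), "", ""))


print(to_schemas_uri("http://stategeothermaldata.org/uri-gin/aasg/xmlschema/activefault/1.1"))
```

A table built from a function like this could feed redirect rules or a lookup inside a client, but either way someone has to decide which side of the mapping is canonical.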
The problem is now apparent if you look at the JSON objects that http://schemas.usgin.org spits out. They list the schemas.usgin.org URI as the URI for the model, while the XSDs all say something else.
What this means is that a system that reads the XSD will think one URI is correct, and a system that reads the JSON will think another is correct. Currently, ckanext-ngds operates against the JSON object, and so it creates metadata records for content-model-aware file uploads that reference schemas.usgin.org.
I think that the root of the problem is kind of philosophical: There must be one machine-actionable, canonical representation of each content model. Any other representations of the model must be derivatives of that canonical model. We should make sure that we're clear what the real representation is.
Also, we just need to decide: should schemas.usgin.org manage URIs for us, or should we ditch that and force ourselves to manage them at resources.usgin.org?
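For anyone who wants to see the discrepancy directly, here is a minimal sketch that compares the URI in a model's JSON object with the targetNamespace of its XSD. Both documents below are stripped-down stand-ins, and the "uri" key name is an assumption; the real JSON that schemas.usgin.org emits may be shaped differently.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the real JSON object and XSD; the "uri" key is an assumption.
MODEL_JSON = '{"uri": "http://schemas.usgin.org/uri-gin/ngds/dataschema/activefault/1.1"}'
XSD_TEXT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="http://stategeothermaldata.org/uri-gin/aasg/xmlschema/activefault/1.1"/>"""


def uris_agree(model_json, xsd_text, uri_key="uri"):
    """Return (agree?, URI from the JSON, targetNamespace from the XSD)."""
    json_uri = json.loads(model_json)[uri_key]
    xsd_namespace = ET.fromstring(xsd_text).get("targetNamespace")
    return json_uri == xsd_namespace, json_uri, xsd_namespace


agree, json_uri, xsd_namespace = uris_agree(MODEL_JSON, XSD_TEXT)
print("JSON says:", json_uri)
print("XSD says: ", xsd_namespace)
print("consistent:", agree)
```

Whichever representation ends up being canonical, a check like this run over every model would show how many derivatives still disagree with it.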