ISO-TC211 / schemas

Official ISO/TC 211 XML Schemas (input to schemas.isotc211.org)

Schema repository contains non-schema information #37

Open ronaldtse opened 3 years ago

ronaldtse commented 3 years ago

Ideally we only want XSD files inside this repository. Then we can manage annotations inside the XSD schemas, and automatically generate model-driven documentation.

@ejbleys thoughts?

ejbleys commented 3 years ago

Hi

Ron wrote: ZIP files. Propose to remove them, no one needs them.

Agreed.

Ron wrote: HTML files that contain description of schemas - these annotations should be subsumed into the XSD files. Propose to remove them and use the new automated documentation generation pipeline.

Need to see what is presented before can commit.

Ron wrote: XSL processing files - these are not schemas. Propose to remove them.

These files should be retained: they are required for transformation between schemas, hence form part of the schemas space.

Ron wrote: Schemas located at: /{number}/{name} instead of /{number}/{part}/{name}/{version}. Can we move them to the proper location?

These files should be retained: they are used by users that have not updated to more recent versions. This is an undertaking that TC211 made to its members - old versions will remain available for use.

Yours
Evert


ronaldtse commented 3 years ago

Ron wrote: XSL processing files - these are not schemas.

Evert wrote: These files should be retained: they are required for transformation between schemas, hence form part of the schemas space.

XSL files are useful for transforming XML files from one schema to another, but the XSL files themselves are supporting tools, not schemas. In addition, creating documentation for XSL files requires a different process from creating documentation for XSD schema files.

I propose that we create a separate repository to store supporting tools. The location of XSL files is not mentioned in the standards, so we are free to choose where to place them.

Ron wrote: Propose to remove them and use the new automated documentation generation pipeline.

Evert wrote: Need to see what is presented before can commit.

Will show. The content of the current HTML files should be subsumed into the schema files annotation elements.
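As a rough illustration of what "subsumed into the annotation elements" could mean in practice, the sketch below reads `xs:annotation`/`xs:documentation` text out of an XSD using only the Python standard library. The schema fragment and the function name `collect_documentation` are invented for this example; this is not the documentation pipeline used for schemas.isotc211.org, only a minimal sketch of the idea.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment, invented for illustration.
SAMPLE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:annotation>
    <xs:documentation>Description formerly kept in a separate HTML file.</xs:documentation>
  </xs:annotation>
  <xs:element name="ExampleElement" type="xs:string">
    <xs:annotation>
      <xs:documentation>Documentation for one element.</xs:documentation>
    </xs:annotation>
  </xs:element>
</xs:schema>"""


def collect_documentation(xsd_text):
    """Return (component label, documentation text) pairs in document order."""
    root = ET.fromstring(xsd_text)
    results = []
    for parent in root.iter():
        # Only direct xs:annotation children belong to this component.
        for annotation in parent.findall(f"{{{XS}}}annotation"):
            for doc in annotation.findall(f"{{{XS}}}documentation"):
                label = parent.get("name") or (
                    "(schema)" if parent is root else "(unnamed)"
                )
                text = " ".join((doc.text or "").split())  # collapse whitespace
                results.append((label, text))
    return results


for label, text in collect_documentation(SAMPLE_XSD):
    print(f"{label}: {text}")
```

Keeping the description inside the XSD means a generator only needs the schema file itself as input, which is the motivation given above for dropping the hand-written HTML.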

Ron wrote: Schemas located at: /{number}/{name} instead of /{number}/{part}/{name}/{version}.

Evert wrote: These files should be retained: they are used by users that have not updated to more recent versions. This is an undertaking that TC211 made to its members - old versions will remain available for use.

Certainly we don't want to break this commitment.

There are two situations I've seen:

  1. If a standard came from standards.iso.org, then regardless of what we do its links are already broken, so we have a free hand in placing the files at the right place. It is up to standards.iso.org to create the correct redirect.

  2. A location is merely a re-direct. For example, I see a number of duplicated files in the paths "1.0" to "1.0.0" because presumably the new correct place is "1.0.0" but we want to retain a link from "1.0" to it. In this case this is a re-direct and we should not store duplicated files.
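To find which directories follow the short `/{number}/{name}` layout rather than `/{number}/{part}/{name}/{version}`, a small classifier along these lines could be run over the repository paths. The concrete shapes assumed for each placeholder (digits for `{number}`, a leading hyphen for `{part}`, dotted digits for `{version}`) and the sample paths are assumptions made for this sketch, not rules stated in this thread.

```python
import re

# Assumed shapes for the placeholders (illustrative only):
#   {number}  -> digits, e.g. "19115"
#   {part}    -> "-" optionally followed by digits, e.g. "-3"
#   {name}    -> a short alphanumeric name, e.g. "mdb"
#   {version} -> dotted digits, e.g. "1.0" or "1.0.0"
FULL = re.compile(
    r"^/(?P<number>\d+)/(?P<part>-\d*)/(?P<name>[A-Za-z0-9]+)"
    r"/(?P<version>\d+(?:\.\d+)+)$"
)
SHORT = re.compile(r"^/(?P<number>\d+)/(?P<name>[A-Za-z0-9]+)$")


def classify(path):
    """Return ("full" | "short" | "other", captured fields) for a directory path."""
    path = path.rstrip("/")
    match = FULL.match(path)
    if match:
        return "full", match.groupdict()
    match = SHORT.match(path)
    if match:
        return "short", match.groupdict()
    return "other", {}


for sample in ("/19115/-3/mdb/1.0", "/19115/mdb/", "/19115/resources/transforms"):
    print(sample, "->", classify(sample)[0])
```

Paths reported as "short" or "other" would be the candidates to review by hand before deciding whether they need a move, a redirect, or to stay where they are.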

I wonder if we should consider "upgrading" the schemas repository to a real ISO 19135-1 register... in the future

ronaldtse commented 3 years ago

@ejbleys On https://schemas.isotc211.org there is now this sample section:

[Screenshot 2021-04-15 at 11:53:12 AM]

Click on it to see what the auto-generated schema documentation looks like, e.g.

[Screenshot 2021-04-15 at 11:54:01 AM]

and

[Screenshot 2021-04-15 at 11:54:14 AM]

ejbleys commented 3 years ago

Hi Ron

Ron wrote: XSL processing files - these are not schemas.

Evert wrote: These files should be retained: they are required for transformation between schemas, hence form part of the schemas space.

Ron wrote: XSL files are useful for transforming XML files from one schema to another, but the XSL files themselves are supporting tools, not schemas. In addition, creating documentation for XSL files requires a different process from creating documentation for XSD schema files.

Ron wrote: I propose that we create a separate repository to store supporting tools. The location of XSL files are not mentioned in standards, so we are free to place them.

Sorry, but XSLTs are within the standard 19115-3 (similar in previous version):

“B.3 Additional resources

The files related to the utilization of codelists that are available for download are found in the directories associated with the packages that define those lists. There are three codelist files associated with those packages: an XML (.xml) encoded using the xml schema from the cat namespace; and an HTML (.html) for presentation to human.

To ease the use of this document, several xml files are available for download in the “resources” directory at https://schemas.isotc211.org/19115/resources. They are organized into the following categories of support: namespaceInformationAndTools; transforms, and codelists.

The namespaceInformationAndTools directory contains an XML file with information describing all the ISO 19115 namespaces, and an XSLT files that convert the XML file into https://schemas.isotc211.org/19115/resources/namespaceInformationAndTools/namespaceSummary.html for display and use by humans.”

Agreed - but this needs to parallel schemas (i.e. not a GitHub repository), and a location needs to exist soon to allow the document to progress to DIS.

Ron wrote: Propose to remove them and use the new automated documentation generation pipeline.

Evert wrote: Need to see what is presented before can commit.

Ron wrote: Will show. The content of the current HTML files should be subsumed into the schema files annotation elements.

OK (the less HTML I have to hand code the better).

But note we need to retain an object at the URL = namespace URI (happy for this to be a redirect to HTML within the schema “directory”). Do you have an example where there are multiple XSDs in a namespace?

Ron wrote: There are two situations I've seen:

Ron wrote: 1. If a standard came from standards.iso.org => i.e. regardless what we do they are already broken, so we have a free hand in placing them at the right place. It is up to standards.iso.org to create the correct redirect.

OK. But note we need to retain an object at the URL = namespace URI (happy for this to be a redirect to HTML within the schema “directory”).

Not quite: the newer location has been promoted for two years, so that would require users to re-modify to take the new changes into account.

Ron wrote: 2. A location is merely a re-direct. For example, I see a number of duplicated files in the paths "1.0" to "1.0.0" because presumably the new correct place is "1.0.0" but we want to retain a link from "1.0" to it. In this case this is a re-direct and we should not store duplicated files.

I suspect 1.0 and 1.0.0 may not be an exact match (if they are?). 1.0 should be either an older version or a URL = namespace URI construct.
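Whether a "1.0" directory really duplicates "1.0.0" can be checked mechanically before any files are replaced by redirects. The sketch below compares two directory trees by SHA-256 digest; the directory and file names in `demo` are made up for the example and are created in a temporary folder, not read from the repository.

```python
import hashlib
import tempfile
from pathlib import Path


def tree_digest(root):
    """Map each relative file path under root to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(a, b):
    """Report files missing on either side and files whose content differs."""
    da, db = tree_digest(a), tree_digest(b)
    return {
        "only_in_a": sorted(set(da) - set(db)),
        "only_in_b": sorted(set(db) - set(da)),
        "different": sorted(k for k in set(da) & set(db) if da[k] != db[k]),
    }


def demo():
    # Hypothetical layout: "1.0" and "1.0.0" side by side in a temp directory.
    with tempfile.TemporaryDirectory() as tmp:
        short, full = Path(tmp, "1.0"), Path(tmp, "1.0.0")
        short.mkdir()
        full.mkdir()
        (short / "example.xsd").write_text("<schema/>")
        (full / "example.xsd").write_text("<schema/>")
        (full / "extra.xsd").write_text("<schema/>")
        return compare_trees(short, full)


print(demo())
```

An empty report for a pair of directories would support treating the shorter path as a pure redirect; any entry in the report means the two locations are not interchangeable.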

Cheers E


PeterParslow commented 3 years ago

This comment is specific to the codelist files.

I hadn't really noticed that 19115-3 describes them as "files ... available for download". The way they are used - even in the examples within the standard itself - clarifies that this "download" is effectively direct access in location ("download" of the particular fragment?).

In 19115-3 the examples are informative (as they were in 19139:2006), but it is certainly the approach that all the implementations I have seen use, so moving the code lists breaks implementations. Of course, not all implementations validate the code list references from instance documents, but some do (at least, the UK profile of 19115:2003 and its open source & online editor implementations!).

It is also good practice - these code list entries are in themselves resources; 19115-3 even describes the code list as a "registry" (recommendation in Table 9)

19139 puts it this way

"The codeList attribute contains a URL that references a codeList definition within a registry or a codelist catalogue." (7.3.5.2 req/codelist/XCT, unchanged since edition 1).

To me, this suggests that the promise made to users not to move the XSDs also applies to the Codelists.

ejbleys commented 3 years ago

Hi Ron & Peter

What is the imperative to shift items that directly relate to schemas out of the schemas repository? As Peter highlights, we should not be rearranging stuff if possible.

As part of XMG, I am proposing that all future codeListValues reside in a single codelists.xml. That file can have a new location (and should be part of a registry). But that is for new versions of standards and example XMLs.

Cheers E


PeterParslow commented 3 years ago

I'll leave Ron to address the question.

"As part of XMG, I am proposing that all future codeListValues reside in a single codelists.xml"

As long as they have clear fragment identifier targets (something of type xml:id that can be exposed with html id), that can work. I believe it would be better for them to be hosted on a vocab server capable of providing responses in a variety of ways - that is, I believe the code lists should be decomposed & hosted on a registry server. Codelists are essentially a set of terms. But at present, our standards say that instances should reference the code list (by URL) and the code list item using two different attributes. So we "have to" keep the existing ones as they are.
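The fragment-identifier idea can be sketched as follows: the `codeList` attribute carries a URL whose fragment names a code list, and `codeListValue` names an item in it. The catalogue structure (`codelists`, `codelist`, `entry`), the `example.org` URL and the function names below are all invented for illustration; they are not the 19139 `cat` encoding or any agreed ISO/TC 211 layout.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical single-file catalogue; element names are made up.
CODELISTS_XML = """<codelists>
  <codelist xml:id="CI_RoleCode">
    <entry xml:id="CI_RoleCode_author">author</entry>
    <entry xml:id="CI_RoleCode_publisher">publisher</entry>
  </codelist>
</codelists>"""


def index_by_id(xml_text):
    """Build a lookup from xml:id value to element."""
    root = ET.fromstring(xml_text)
    return {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}


def resolve(code_list_url, code_list_value, index):
    """True if the URL fragment names a known code list containing the value."""
    _, fragment = urldefrag(code_list_url)
    codelist = index.get(fragment)
    if codelist is None:
        return False
    return any((entry.text or "").strip() == code_list_value for entry in codelist)


index = index_by_id(CODELISTS_XML)
url = "https://example.org/codelists.xml#CI_RoleCode"  # placeholder URL
print(resolve(url, "author", index))
```

Because existing instance documents already carry such URLs, a check like this only keeps working if the file behind the URL stays where it is, which is the reason given above for leaving the published code lists in place.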

ejbleys commented 3 years ago

What I propose (as an interim process prior to a registry or vocab service) has web-accessible nested URLs for each: standard; namespace; codelist; and codeListValue (using ids as anchors within the file). It takes advantage of the recursive lookup in catalogue (ISO 19139) using sub-catalogue association.

Cheers E

ronaldtse commented 3 years ago

In light of the above discussion of code lists and resources:

Tools are by nature different from data. Given this repository is called "schemas", they definitely don't belong to the schema category. Let's keep them in a separate repository? e.g. "xml-resources"?

XML codelists are not schemas, they can "exist" inside schemas. If they are offered separately as XML files, they should probably be considered "supporting resources".

I just don't want these things to pollute the current schema naming pattern of:

https://schemas.isotc211.org/{standard-path-pattern}/{filename}

I wonder if we should place them in a place like:

https://resources.isotc211.org/{standard-path-pattern}/{filename}
https://schemas.isotc211.org/resources/{standard-path-pattern}/{filename}

ejbleys commented 3 years ago

https://schemas.isotc211.org/resources/{standard-path-pattern}/{filename} works for me, but only for new material, as systems have already been updated to existing locations on https://schemas.isotc211.org/

Cheers e

PeterParslow commented 3 years ago

If you want to change or add to the URI pattern(s) that are in the "good practice", could you get all the "MG" convenors to agree? It would be good to document there where this is a transition, showing that some standards were published before the pattern was agreed.

https://committee.iso.org/sites/tc211/home/resolutions/isotc-211-good-practices/--structure-of-uris-in-isotc-211.html - no pattern there for "resources", but as Evert says, some standards do describe them.

ronaldtse commented 1 year ago

Ron wrote: HTML files that contain description of schemas - these annotations should be subsumed into the XSD files. Propose to remove them and use the new automated documentation generation pipeline.

The new pipeline is now in place, and we have just migrated (and cleaned up) all the descriptions into their proper locations so they are now viewable.

ronaldtse commented 1 year ago

Ron wrote: A location is merely a re-direct. For example, I see a number of duplicated files in the paths "1.0" to "1.0.0" because presumably the new correct place is "1.0.0" but we want to retain a link from "1.0" to it. In this case this is a re-direct and we should not store duplicated files.

I have done this in #62 to create symlinks for all {major}.{minor} to {major}.{minor}.{patch}, and verified that all {major}.{minor} pages work.
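For readers who want to reproduce the idea locally, the following is a minimal sketch of creating `{major}.{minor}` symlinks that point at `{major}.{minor}.{patch}` directories. It is not the change made in #62; the function name, the choice to link to the highest patch number when several exist, and the demo directory names are assumptions for this example.

```python
import os
import re
import tempfile
from pathlib import Path

PATCH_DIR = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def link_minor_versions(namespace_dir):
    """Create {major}.{minor} symlinks to the highest {major}.{minor}.{patch}."""
    namespace_dir = Path(namespace_dir)
    latest = {}
    for child in namespace_dir.iterdir():
        match = PATCH_DIR.match(child.name)
        if match and child.is_dir() and not child.is_symlink():
            major, minor, patch = map(int, match.groups())
            key = (major, minor)
            # Assumption: the highest patch wins when several exist.
            if key not in latest or patch > latest[key][0]:
                latest[key] = (patch, child.name)
    created = {}
    for (major, minor), (_, target) in sorted(latest.items()):
        link = namespace_dir / f"{major}.{minor}"
        if link.is_symlink():
            link.unlink()  # refresh an existing link
        if not link.exists():  # never overwrite a real directory
            link.symlink_to(target, target_is_directory=True)  # relative link
            created[link.name] = target
    return created


def demo():
    # Hypothetical version directories in a temp folder.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("1.0.0", "1.0.1", "2.3.0"):
            Path(tmp, name).mkdir()
        created = link_minor_versions(tmp)
        targets = {name: os.readlink(Path(tmp, name)) for name in created}
        return created, targets


print(demo())
```

A relative link target keeps the links valid when the repository is cloned or deployed under a different root. Creating symlinks may need extra privileges on some platforms, so this sketch is best tried on a POSIX system.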

ejbleys commented 1 year ago

Hi Ron

The rationale for the …/1.0 existing whilst the schemas are in …/1.0.# was so that the namespace URI would be resolvable. That rationale still holds.

Evert Bleys