cf-convention / cf-convention.github.io

Harmonize and improve XSD Schema files and their link to the XML standard name table files #457

Open larsbarring opened 6 months ago

larsbarring commented 6 months ago

The overall aim of this enhancement proposal is to improve how the standard name tables are set up and formatted, as specified in Appendix B and in the schema files. With these improvements it will become difficult or impossible for simple formatting errors to creep into the published tables.

Originally prompted by an interest in the ideas put forth in https://github.com/cf-convention/cf-convention.github.io/issues/7, I started looking in more detail at the XML files underlying the standard name tables. My attention was then drawn to https://github.com/cf-convention/cf-conventions/issues/132, as well as the recent https://github.com/cf-convention/vocabularies/issues/56.

Digging further, I found that there are some imperfections in the formatting of the different versions of the xml files, partly because they are disconnected from the xsd schema files stated in the xml files. This disconnect makes it easier for trivial typos and simple formatting mistakes to slip through and be published.
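To make the "schema stated in the xml file" concrete, here is a minimal sketch that reads the schema location declared on the root element of a table. It assumes the table declares its schema through the standard `xsi:noNamespaceSchemaLocation` attribute; the sample document, the file name `example-schema.xsd` and the helper name `declared_schema` are hypothetical and only serve as illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the standard XML Schema instance attributes (xsi:...)
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def declared_schema(xml_text):
    """Return the schema location declared on the root element, or None."""
    root = ET.fromstring(xml_text)
    return root.get(f"{{{XSI_NS}}}noNamespaceSchemaLocation")


# Hypothetical, heavily shortened table; not a real published file.
SAMPLE = """<?xml version="1.0"?>
<standard_name_table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="example-schema.xsd">
  <version_number>0</version_number>
</standard_name_table>"""

print(declared_schema(SAMPLE))
```

Running something like this over every published version would show which schema file each one claims to follow. It does not validate anything: checking a file against its schema needs an XSD-aware validator.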

Sorting this out will help pave the way towards moving forward or closing several open issues related to the standard name tables and how they are presented on the web site, namely:

However, rectifying this will require several concrete and rather independent steps, covering the format of the xml files as specified in Appendix B of the conventions document, the format and location of the xsd schema files, and where they are published on the web site. Hence, this issue is intended to provide an overview of all this and to help keep track of the progress of the different issues and pull requests. Currently the following are identified (will be edited as the work progresses).

All suggested changes have been tested in a separate fork of the repo.

The Date column indicates when a PR can be merged, once the issue has received enough support and the PR has been reviewed.

The Status column indicates whether there is a PR ready to be reviewed. In several cases there is a draft PR, which means that it depends directly on previous PRs being merged, but also that there is enough information to anticipate what the final PR would include, which allows preliminary reviews.

| Status | Issue | Date |
| --- | --- | --- |
| Done | #500 Standard names: Add "Conventions" string to the standard name xml table header | 2024-02-23 |
| Done | #509 In exceptional cases allow a standard name to be aliased into two alternatives | 2024-03-15 |
| Done | #511 Add a "first published date" to the xml files | 2024-03-29 |
| Done | #433 XML schema definition file, i.e. their location | 2024-04-10 |
| Done | #459 Harmonise content of the schema definition files (and thus the format of the xml files) | 2024-04-17 (conditional) |
| Done | #469 Publication of the standard name table XML schema file on the website | 2024-04-17 |
| Done | #516 Update the XML format specification in Appendix B to provide a robust link to the XML schema file | 2024-04-17 |
| Done | NEW: #481 Minor update to the newly added XML schema file | defect => "now" |
| Ongoing | #470 Implementation of the schema file in all versions of the XML standard name file | |
| Draft PR exists | #471 Update the XSL file for conversion of XML to html | after issue #470 has been closed |

JonathanGregory commented 6 months ago

Thanks for your time and meticulous work on these issues.

DocOtak commented 5 months ago

Would it be worth having the aliases and standard name entries be sorted? Right now I think the standard names are, but the aliases appear in some arbitrary order (possibly entry order).

I don't know if XSD supports this either.

larsbarring commented 5 months ago

I think that this is worth considering. I guess (really guess) that possibly XSD can enforce the aliases to be sorted, but the actual sorting has to be done when producing the XML file. So, let's keep this in mind when we arrive there.
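As a rough illustration of the "sorting has to be done when producing the XML file" part, the sketch below reports which `id` values are out of alphabetical order. It assumes that entries and aliases are direct children of the root element, named `entry` and `alias`, each with an `id` attribute; the sample content and the helper name `unsorted_ids` are made up for the example.

```python
import xml.etree.ElementTree as ET


def unsorted_ids(xml_text, tag):
    """Return the ids of <tag> elements that sort before their predecessor."""
    root = ET.fromstring(xml_text)
    ids = [el.get("id", "") for el in root.findall(tag)]
    # Compare each id with the one directly before it in document order.
    return [cur for prev, cur in zip(ids, ids[1:]) if cur < prev]


# Hypothetical, heavily shortened table; the alias names are invented.
SAMPLE = """<standard_name_table>
  <entry id="air_pressure"/>
  <entry id="air_temperature"/>
  <alias id="sea_level"><entry_id>air_pressure</entry_id></alias>
  <alias id="air_temp"><entry_id>air_temperature</entry_id></alias>
</standard_name_table>"""

print(unsorted_ids(SAMPLE, "entry"))  # entries are in order
print(unsorted_ids(SAMPLE, "alias"))  # "air_temp" comes after "sea_level"
```

A check like this could run when the table is generated, independently of whatever the schema itself is able to enforce.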

larsbarring commented 5 months ago

Following a suggestion (offline) from @JonathanGregory (thanks!) I have created issues for all the steps that are necessary (see the table above) to implement a proper link between the standard name table XML files and their schema file. In this way it is possible to get a better overview of the chain of changes necessary, and to discuss them. While the individual issues are spread out across this repo and the cf-conventions repo, they should be considered as a sequence of changes that together reach the aim of improving the XML format.

One is already implemented, and the next three have already reached agreement and are ready to go after the cool-down period (if no new concerns arise). For the remaining five it is a bit difficult to create pull requests already now, because their content depends on the implementation of the previous ones.

Despite that, I hope that we can start discussing them to move them forward towards agreement and implementation. I suggest that the discussion take place in the individual issues, and that we keep track of the overall progress in the table above. We can always add new issues if need be.

ping @JonathanGregory, @japamment, @DocOtak, @davidhassell, @sadielbartholomew

sadielbartholomew commented 5 months ago

Thanks Lars, and all. That makes sense to me. Let us know if we can help in any particular way towards this.

larsbarring commented 5 months ago

Hi Sadie @sadielbartholomew,

Thanks for the offer -- there is in fact one thing you could do:

Merge PR https://github.com/cf-convention/cf-conventions/issues/510 that implements https://github.com/cf-convention/cf-conventions/issues/509 (which is supported by @JonathanGregory, @davidhassell and @japamment), because ...

... then I can get rid of the conflict in history.md of https://github.com/cf-convention/cf-conventions/issues/510 ...

which in turn helps when I create a draft PR for https://github.com/cf-convention/cf-conventions/issues/516. That is, I will be able to move the history.md conflict down one step in this chain of PRs.

Many thanks, Lars

sadielbartholomew commented 5 months ago

Thanks - do you just want me to press the 'merge' button, or should I review it as well? Happy to review but just want to clarify what exactly you want, since pressing the button can be done by anyone, right?

larsbarring commented 5 months ago

I think that just pressing the button should be all right. In the associated issue Jonathan and David wrote that the PR looked OK. The procedure is that the proposer and PR producers should not do the merging. But an extra pair of eyes never hurts if you feel like it ... :-))

sadielbartholomew commented 5 months ago

Thanks Lars. It was a very small PR so I have reviewed it and it is all good, though I did notice there was a merge conflict to resolve in history.adoc so I have resolved that in https://github.com/cf-convention/cf-conventions/pull/510/commits/4cc3bc19d640ddaffa7ec8e578f0fce31ca970cb (quite trivial and uncontroversial, otherwise I would have checked in with you about it before merging). So all sorted now.