cf-convention / vocabularies

Issues and source files for CF controlled vocabularies

How to deal with standard names having a <space> character #7

Closed japamment closed 5 months ago

japamment commented 6 months ago

Discussed in https://github.com/orgs/cf-convention/discussions/310

Originally posted by **larsbarring** May 7, 2024

### Topic for discussion

In earlier versions of the standard name table, a small number of standard names were accidentally defined with a `space` character in the name. This does not conform to the CF convention but was not easy to spot. When this was discovered, the corrected standard name was defined and, in most cases, the non-conformant standard name was aliased to the correct one (cf. [here](https://github.com/cf-convention/cf-conventions/issues/132)). See the table below for details.

| Defined in version(s) | Standard name (black square = `space`) | Aliased to correct name in version(s) |
| -- | -- | -- |
| 8 -- 10 | mole_fraction_of_chlorine■dioxide_in_air | 11 -- 84 |
| 8 -- 10 | mole_fraction_of_chlorine■monoxide_in_air | 11 -- 84 |
| 8 -- 10 | mole_fraction_of_dichlorine■peroxide_in_air | 11 -- 84 |
| 8 -- 10 | mole_fraction_of_hypochlorous■acid_in_air | 11 -- 84 |
| 8 | mole_fraction_of_methyl_chloride■_in_air | no alias, correct in 9 -- 84 |
| 28 -- 36 | rate_of_■hydroxyl_radical_destruction_due_to_reaction_with_nmvoc | 37 -- 84 |
| 10 -- 18 | tendency_of_potential_energy_content_of_ocean_layer_due_to_diffusion■ | no alias, correct in 19 -- 84 |
| 12 -- 19 | atmosphere_moles_of_acetic_acid■ | no alias, correct in 20 -- 84 |
| 12 -- 19 | atmosphere_moles_of_alpha_hexachlorocyclohexane■ | no alias, correct in 20 -- 84 |
| 12 -- 19 | tendency_of_atmosphere_mass_content_of_nitrogen_monoxide_due_to_emission■ | no alias, correct in 20 -- 84 |

In the overhaul of all the already published versions of the standard name table (see [here](https://github.com/cf-convention/cf-convention.github.io/issues/457)), it became clear that the XML syntax, in combination with the XML schema definition used by CF, does not allow a space in the standard name, irrespective of whether it appears in a definition or as an alias. Thus, common XML parsing tools are not guaranteed to give the correct result.

Because of the double problem of not conforming to CF rules and violating XML syntax, I would like to suggest that standard names containing a `space` not be allowed in definitions or in aliases. This can be implemented with varying degrees of strictness (from minimal to maximal):

1. Leave all already published standard name tables as is, but do not include aliases in future versions.
2. Remove aliases in table version N and newer.
3. Only keep the first alias and remove all subsequent ones.
4. Correct the standard name definitions by removing the spurious space.

For alternatives 1-3, to maintain the lineage between the wrong and the corrected names, an alias should be added for those names that do not already have one. This should be done only in the version where the correct name first appears.
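As a rough illustration of how such names could be spotted, the following minimal sketch scans a standard name table for `entry` and `alias` elements whose `id` attribute contains whitespace. The embedded XML excerpt is hypothetical and heavily reduced, and the element and attribute names (`entry`, `alias`, `id`, `entry_id`) are assumptions about the table layout; `find_names_with_whitespace` is an illustrative helper, not part of any CF tooling. Note that `xml.etree.ElementTree` does not validate against a schema, so it reads such names without complaint.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced excerpt of a standard name table.
# Element and attribute names are assumptions made for this sketch.
SAMPLE_TABLE = """<standard_name_table>
  <entry id="mole_fraction_of_chlorine_dioxide_in_air">
    <canonical_units>1</canonical_units>
  </entry>
  <entry id="atmosphere_moles_of_acetic_acid ">
    <canonical_units>mol</canonical_units>
  </entry>
  <alias id="mole_fraction_of_chlorine dioxide_in_air">
    <entry_id>mole_fraction_of_chlorine_dioxide_in_air</entry_id>
  </alias>
</standard_name_table>
"""


def find_names_with_whitespace(xml_text):
    """Return (tag, name) pairs for entries and aliases whose id has whitespace."""
    root = ET.fromstring(xml_text)
    offenders = []
    for element in root.iter():
        # Only definitions (entry) and aliases carry a standard name in "id".
        if element.tag in ("entry", "alias"):
            name = element.get("id", "")
            if any(ch.isspace() for ch in name):
                offenders.append((element.tag, name))
    return offenders


if __name__ == "__main__":
    for tag, name in find_names_with_whitespace(SAMPLE_TABLE):
        # repr() makes leading and trailing spaces visible.
        print(tag, repr(name))
```

Using `repr()` when reporting is deliberate: trailing spaces, as in several rows of the table above, are otherwise invisible in plain output.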
japamment commented 6 months ago

I created this issue from a discussion ... it doesn't seem to have transferred the comments with it.

japamment commented 6 months ago

No further comments have been received so this issue is now accepted. The aliases containing spaces will not appear in V85 of the standard name table onwards. We will soon reprocess earlier versions of the table to remove the spaces and make the XML schema corrections.
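For illustration only, removing the spurious space from a name could be sketched as below. This is not the procedure actually used for the reprocessing; `remove_spurious_space` is a hypothetical helper, and it assumes that a space between two words stands in for a missing underscore, while a space next to an existing underscore or at either end of the name is simply dropped.

```python
import re


def remove_spurious_space(name):
    """Return a standard name with stray whitespace removed (assumed rule).

    Leading and trailing whitespace is stripped. Interior whitespace,
    together with any underscore directly next to it, collapses to a
    single underscore.
    """
    return re.sub(r"_?\s+_?", "_", name.strip())


if __name__ == "__main__":
    examples = [
        "mole_fraction_of_chlorine dioxide_in_air",
        "mole_fraction_of_methyl_chloride _in_air",
        "atmosphere_moles_of_acetic_acid ",
    ]
    for raw in examples:
        print(repr(raw), "->", repr(remove_spurious_space(raw)))
```

Names that contain no whitespace pass through unchanged, so the function can be applied to every name in a table without special-casing.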

larsbarring commented 5 months ago

I am closing this because the suggested change has been accepted and successfully implemented in standard name table version 85. Implementation in the earlier table versions will happen in relation to #470.