tdwg / tag

Technical Architecture Group
https://tag.tdwg.org/

Guidance for vocabulary maintenance and creation regarding the status of term properties and documents #12

Closed: baskaufs closed 2 years ago

baskaufs commented 7 years ago

The Standards Documentation Specification (SDS) puts an end to the previous practice of designating particular standards documents as "Type 1" (normative) and "Type 2" (non-normative). Instead, it says in Section 1: "Users must be able to easily determine which parts of the standard are definitive (normative) and which are informative (non-normative)." It provides the following definitions:

normative content - prescriptive parts of a standard that specify features, characteristics, or behaviors that are necessary to comply with the standard

non-normative content - informative parts of a standard that provide supplemental information such as history, examples, and additional explanation beyond the information necessary to comply with the standard.

In the Darwin Core (DwC) hackathon from 2017-09-29 to 2017-10-01, we had to deal with this in a practical way as we sought to streamline the mass of pre-existing Darwin Core content, and to make the cleaned-up content conform to the SDS. There were several places where we made decisions about whether particular content should be considered a normative part of the standard, a non-normative part of the standard, or outside of the standard itself.

In these decisions, we were guided in part by Section 3 of the Vocabulary Maintenance Specification. That section indicates that the extent of the requirements triggered by a change (public comment, executive approval, public notification, etc.) depends on the status of the content being changed. For example, changing normative term definitions clearly requires the full comment and approval process, while correcting typos or examples can be carried out at will by the maintaining interest group.

Given this understanding, the embryonic DwC Maintenance Group decided to remove entirely some term properties from the metadata about the terms. For example, the tdwgutility:organizedInClass (http://rs.tdwg.org/dwc/terms/attributes/organizedInClass) property was really only being used to organize the DwC terms into sections of the Quick Reference Guide and had no real relevance to users of the vocabulary. So we pulled that property out of the primary metadata associated with the terms.

However, there were some gray areas. In particular, should labels be considered non-normative? The SDS (Section 4.5) says that English labels should be included in the term metadata, and labels in other languages should be maintained outside of the standard. But should the English labels be considered a normative part of the term metadata?

In another example, Audubon Core specifies for each term whether it is part of the primary layer, whether it is required, and whether it can be repeated. Should these properties be included within the vocabulary itself (as is currently the case), or should they be considered to be part of a sort of application profile that may be required by some communities of interest but not others?

It seems to me that it should be within the portfolio of the TAG to provide guidance across vocabularies as to which term properties should be considered normative, which should be considered non-normative but included within the standard, and which properties should be asserted outside of the standard. I think that it is particularly important to get out in front of this issue given that there will probably be one or more task groups developing controlled vocabularies in the near future.

In addition, what kinds of documents should be considered within the standard and which should be maintained outside of the standard? For example, the DwC hackathon group concluded that the DwC Text, XML, and RDF Guides should be considered as part of the standard, since they contained normative descriptive content (text descriptions of how to conform to the specification). However, we concluded that the Quick Reference Guide should be considered outside of the standard, since it needs to be nimbly maintained for maximum ease of use.

Perhaps there should be a working group within the TAG to address this issue and to come up with some recommendations to be acted upon by the full TAG.

baskaufs commented 2 years ago

Since this issue was raised in 2017, documents for current standards have been brought into conformance with the SDS. In the "List of terms" documents (generated from the authoritative metadata tables in rs.tdwg.org), there are "Status of the content of this document" sections that explicitly say that labels are not normative. See for example this document. So that precedent has been set.

Another issue raised here is about inclusion of properties other than those laid out in the SDS. This has been left up to the Maintenance Groups to designate. In the example given here, the Required and Repeatable values were designated as normative.
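To make the outcome for term properties concrete, here is a minimal Python sketch of the classifications mentioned in this thread. The property keys and the `needs_full_review` helper are hypothetical names chosen for illustration, not part of any TDWG tooling, and the mapping from status to review process is a simplified reading of Section 3 of the Vocabulary Maintenance Specification as summarized above.

```python
from enum import Enum


class Status(Enum):
    NORMATIVE = "normative"
    NON_NORMATIVE = "non-normative, but within the standard"
    OUTSIDE = "outside the standard"


# Hypothetical property keys; statuses follow the decisions described in this thread.
PROPERTY_STATUS = {
    "definition": Status.NORMATIVE,        # changing it needs full comment and approval
    "label": Status.NON_NORMATIVE,         # "List of terms" documents say labels are not normative
    "required": Status.NORMATIVE,          # Audubon Core, designated by its Maintenance Group
    "repeatable": Status.NORMATIVE,        # Audubon Core, designated by its Maintenance Group
    "organizedInClass": Status.OUTSIDE,    # pulled out of the primary DwC term metadata
}


def needs_full_review(changed_property: str) -> bool:
    """Return True when a change touches normative content (simplified assumption)."""
    return PROPERTY_STATUS[changed_property] is Status.NORMATIVE
```

Under these assumptions, `needs_full_review("definition")` is true while `needs_full_review("label")` is false, which matches the distinction between changing a definition and correcting a label.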

The question of whether documents should be included within or outside of the standard seems to have been settled based on whether the documents contain any normative content. The DwC Quick Reference Guide is out, the XML Guide is in, etc. In Audubon Core, the Structure document is in, as it defines how documents should be structured to conform with AC expectations. The one unclear case is the "Audubon Core Guide", which contains no normative content but is included in the standard (see this table). However, I think the general rule of thumb has been established that a document should probably contain at least some normative content to be part of the official standard.
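The rule of thumb can be sketched as a small Python check over the documents named here. This is only an illustration: the `Document` record and `exceptions_to_rule` helper are hypothetical, and the `has_normative_content` value for the Quick Reference Guide is inferred from this discussion rather than taken from an authoritative table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    name: str
    has_normative_content: bool  # as described in this thread (assumption for the Quick Reference Guide)
    in_standard: bool            # whether the document is currently part of the standard


DOCUMENTS = [
    Document("DwC Quick Reference Guide", has_normative_content=False, in_standard=False),
    Document("DwC XML Guide", has_normative_content=True, in_standard=True),
    Document("Audubon Core Structure document", has_normative_content=True, in_standard=True),
    Document("Audubon Core Guide", has_normative_content=False, in_standard=True),
]


def exceptions_to_rule(docs):
    """List documents whose inclusion does not match 'in the standard iff it has normative content'."""
    return [d.name for d in docs if d.in_standard != d.has_normative_content]
```

With the values above, the only document the check flags is the "Audubon Core Guide", the unclear case already noted.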

I've put a note in a stub checklist so these conventions won't be lost.