tdwg / hc

Humboldt Core Charter, a Task Group of the Observations & specimens Interest Group
https://eco.tdwg.org

Do we have to define classes within HC? #16

Closed pzermoglio closed 1 year ago

pzermoglio commented 3 years ago

As of right now, during the first review round, terms originally in HC have been provisionally assigned to existing DwC classes (or to the record level). From the perspective of a potential user of the HC extension, and specifically in the Quick Reference Guide, it may be useful to group the terms into classes so that they appear better organized visually in the guide.

So options would be:

  1. Not have classes defined. All terms would be at the (event) record level. In the QRG, all terms would be listed in an order that we can predefine.

  2. Use classes from DwC (+ potentially record-level). Right now we would have 2 classes, Event and Identification.

  3. Define some classes (+ potentially use record-level). This would probably make sense to a user, but should also make sense conceptually speaking, i.e., we should probably NOT come up with classes just for the sake of categorization, but rather because they are semantically meaningful.

    • One possibility would be to evaluate the categories originally used in the HC paper to group terms:
      • general information
      • identification information
      • site details
      • geospatial information (no terms currently included in the proposed first round review vocab, as all would already be in the Event Core)
      • temporal scope
      • organismal information
      • environmental conditions
      • methodology
      • effort
      • data quality (no terms currently included in the proposed first round review vocab, as the term in this category would be reported in EML)
      • completeness

Note that in all three cases, terms borrowed from DwC (right now 2 terms in the Identification class) would already have the attribute organizedInClass=Identification in RDF.

Thoughts, everyone?

baskaufs commented 3 years ago

The field tdwgutility_organizedInClass can contain "real" standard-defined classes or non-standard "convenience" classes (defined in the tdwgutility: namespace). The purpose of this column is to divide the terms into categories within the Quick Reference Guide and the Term List Document.

Dividing terms up in this way does not establish any formal semantics (like a domain declaration). Rather it's a sort of "suggestion" about the kind of subject that these properties should be used with. So a term in the Occurrence category should probably be used in Occurrence records, although there isn't really anything stopping people from using it in some other way.

So whether we want to actually define official classes or not is up to us. The main advantage would be to put a stake in the ground trying to more clearly define the category. By officially defining a class term, one could potentially clarify how that class is related to others using an ontology or a graph model. If all we want to do is to specify how the terms should be laid out on the page in the QRG, that's probably overkill.

The script that generates the List of Terms document allows the category text to be different from the actual class label (for example, Audubon Core calls the category "Geography Vocabulary" when in reality, they are all "Location" terms). There is also the possibility to provide additional commentary about a category (see for example the comments about Service Access Point Vocabulary).

It is also possible to mix real and convenience categories in the organizedInClass column. Most of the categories in the DwC QRG are "real" but there's one artificial one ("useWithIRI") to pull those terms into a separate section of the guide. It is also possible to change an artificial category to a real one in the future. We did that with the ServiceAccessPoint class in Audubon Core.
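
To make the grouping idea concrete, here is a minimal sketch of how a document-building script might use the organizedInClass column. It is not the actual TDWG build script. The `term_localName` column name, the placeholder term names, and the `group_terms` helper are assumptions made up for illustration; only the `tdwgutility_organizedInClass` column, the "useWithIRI" convenience category, and the "Location" / "Geography Vocabulary" relabeling come from the discussion above.

```python
import csv
import io

# Hypothetical metadata table. The term names are placeholders, not real
# Humboldt Core or Darwin Core terms.
SAMPLE_ROWS = "\n".join([
    "term_localName,tdwgutility_organizedInClass",
    "exampleTermA,Event",
    "exampleTermB,Location",
    "exampleTermC,Event",
    "exampleTermD,useWithIRI",  # a convenience category, not a real class
])


def group_terms(csv_text, display_labels=None):
    """Group term names by category, keeping first-seen category order.

    display_labels optionally maps a category value to the heading text
    shown in the generated document, so the heading can differ from the
    class label.
    """
    display_labels = display_labels or {}
    groups = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        category = row["tdwgutility_organizedInClass"].strip()
        heading = display_labels.get(category, category)
        groups.setdefault(heading, []).append(row["term_localName"].strip())
    return groups


if __name__ == "__main__":
    sections = group_terms(SAMPLE_ROWS, {"Location": "Geography Vocabulary"})
    for heading, terms in sections.items():
        print(heading)
        for term in terms:
            print("  " + term)
```

Nothing in this grouping treats a real class differently from a convenience category. Both are just values in the column that decide which section of the page a term lands in, which is why no formal semantics follow from it.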

tucotuco commented 3 years ago

It looks like the answer to the issue title is, "No, we don't have to define classes in HC, even to organize terms in categories in a Quick Reference Guide." I don't see any natural new classes jumping out. If there are ideas for them, they should be defined now with the rest of the terms. Any convenience categories for the Quick Reference Guide can wait until that is ready to build for the first time.

tucotuco commented 1 year ago

No classes needed to be defined for the Extension proposal. In Darwin Core we resolved the issue of being able to organize terms in categories that are not classes, and this capability was incorporated for the Humboldt Extension with categories such as Site, Habitat Scope, Temporal Scope, etc.