wmo-im / wcmp2

WMO Core Metadata Profile 2
https://wmo-im.github.io/wcmp2

clarify keywords and themes/concepts #54

Closed by tomkralidis 1 year ago

tomkralidis commented 2 years ago

WCMP2 (via OARec) provides both keywords and themes/concepts properties. Both are used as (optional) catalogue queryables. The general idea is that keywords provide a list of free-form terms/tags, whereas themes/concepts provide terms from controlled vocabularies.

Users may find it hard to tell which property serves which purpose, so we should document clearly when to use each one.

steingod commented 2 years ago

Which one is intended for describing, e.g., the variables of a dataset? If there is no predefined terminology to use (which would require an easy governance mechanism for adding new terms, as experienced by both CF and GCMD), there is a specific need to identify the terminology using URIs/identifiers. This would also enable semantic translation in search interfaces targeting different communities.

tomkralidis commented 2 years ago

themes/concepts would be in scope for variables, where the concepts would describe the variables and the scheme would point to the associated vocabulary.
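As a rough illustration of that split, a record could carry free-form tags in keywords and scheme-qualified variable names in themes/concepts. The sketch below is an assumption based on the description above, not a quote from the WCMP2 schema: the exact layout of themes, the placeholder vocabulary URL, and the helper name `concepts_for_scheme` are all hypothetical.

```python
# Hypothetical record fragment: free-form keywords plus scheme-qualified themes.
# The layout and the example.org URL are assumptions for illustration only.
VARIABLE_SCHEME = "https://example.org/vocab/variables"  # placeholder URL

record_properties = {
    "keywords": ["surface", "temperature", "observations"],
    "themes": [
        {
            "concepts": ["air_temperature", "wind_speed"],
            "scheme": VARIABLE_SCHEME,
        }
    ],
}


def concepts_for_scheme(properties, scheme):
    """Return all concepts whose theme points at the given vocabulary."""
    found = []
    for theme in properties.get("themes", []):
        if theme.get("scheme") == scheme:
            found.extend(theme.get("concepts", []))
    return found


variables = concepts_for_scheme(record_properties, VARIABLE_SCHEME)
print(variables)
```

A client that only understands one vocabulary can then ignore themes with any other scheme, while keywords stay usable for plain text search.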

steingod commented 2 years ago

Makes sense. How do you handle the identifier of a concept, assuming the content of concept is a human-readable form of the concept itself? And would there then be no easy way to distinguish between kinds of concepts (e.g. what is a variable/parameter, what is the generation mechanism, such as satellite, numerical simulation or observation) without parsing the schema?

steingod commented 2 years ago

Concerning the use of identifiers for concepts etc., I think this can be illustrated by the science-on-schema.org approach for similar terms:

    {
      "@type": "DefinedTerm",
      "name": "OCEANS",
      "inDefinedTermSet": "https://gcmd.earthdata.nasa.gov/kms/concepts/concept_scheme/sciencekeywords",
      "url": "https://gcmd.earthdata.nasa.gov/kms/concept/91697b7d-8f2b-4954-850e-61d5f61c867d",
      "termCode": "91697b7d-8f2b-4954-850e-61d5f61c867d"
    },
    {
      "@type": "DefinedTerm",
      "name": "ice core studies",
      "inDefinedTermSet": "https://vocabularyserver.com/cnr/ml/snowterm/en/",
      "url": "https://vocabularyserver.com/cnr/ml/snowterm/en/index.php?tema=29330",
      "identifier": {
        "@type": "PropertyValue",
        "propertyID": "https://registry.identifiers.org/registry/ark",
        "value": "ark:/99152/t3v4yo3eeqepj0",
        "url": "https://vocabularyserver.com/cnr/ml/snowterm/en/?ark=ark:/99152/t3v4yo3eeqepj0"
      }
    }

There are different ways of handling this, but as you can see, in addition to identifying the vocabulary you have identifiers for the individual terms, which are easier to use than the text when doing on-the-fly semantic mappings between different terminologies.

Would it make sense to have a type characteristic (e.g. parameter) here as well? That would identify the scope of a term more easily than looking it up in the terminology, and would help on the parsing side to determine how to proceed.
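To make the point about identifiers concrete, here is a minimal sketch of a mapping keyed by term URI instead of by label. The GCMD URL is taken from the example above; the target URI, the `URI_MAPPING` table and the `translate_term` helper are hypothetical and only stand in for whatever mapping a search interface would maintain.

```python
# Hypothetical cross-vocabulary mapping keyed by term URI, not by label.
GCMD_OCEANS = (
    "https://gcmd.earthdata.nasa.gov/kms/concept/"
    "91697b7d-8f2b-4954-850e-61d5f61c867d"
)

URI_MAPPING = {
    # The target URI is a placeholder, not a real vocabulary entry.
    GCMD_OCEANS: "https://example.org/vocab/other/ocean",
}


def translate_term(term, mapping):
    """Return the mapped URI for a DefinedTerm-like dict, or None if unmapped."""
    return mapping.get(term.get("url"))


term = {"@type": "DefinedTerm", "name": "OCEANS", "url": GCMD_OCEANS}
# A differently labelled copy of the same term still resolves, because the
# lookup uses the identifier and never touches the human-readable name.
relabelled = dict(term, name="Oceans")

print(translate_term(term, URI_MAPPING))
print(translate_term(relabelled, URI_MAPPING))
```

Matching on the label instead would break as soon as two communities spell or translate the same term differently.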

tomkralidis commented 1 year ago

Qualified concepts came up today during a WCMP2/WIGOS/CDM discussion. Associated issue raised in https://github.com/opengeospatial/ogcapi-records/issues/188

tomkralidis commented 1 year ago

FYI based on updates in https://github.com/opengeospatial/ogcapi-records/issues/188, PR issued in #78

tomkralidis commented 1 year ago

During TT-WISMD 2022-12-16, concern was raised that building out concepts as objects (vs. a flat array) makes WCMP2 less simple. We need more documentation to articulate the value of the updated structure in #78.
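One way to picture the trade-off is a small normalisation step that accepts both shapes. This is a sketch under stated assumptions: the key names `id`, `title` and `url` and the helper `normalize_concepts` are hypothetical and are not taken from #78.

```python
def normalize_concepts(concepts):
    """Turn a mixed list of strings and dicts into a list of concept objects.

    Plain strings (the flat-array style) become {"id": <string>}; dicts are
    copied unchanged so extra fields such as a title or URL are preserved.
    The key names used here are illustrative assumptions.
    """
    normalized = []
    for concept in concepts:
        if isinstance(concept, str):
            normalized.append({"id": concept})
        else:
            normalized.append(dict(concept))
    return normalized


flat = ["air_temperature", "wind_speed"]
rich = [
    {
        "id": "air_temperature",
        "title": "Air temperature",
        "url": "https://example.org/vocab/variables/air_temperature",  # placeholder
    }
]

print(normalize_concepts(flat))
print(normalize_concepts(rich))
```

The flat form is shorter to write, while the object form leaves room for an identifier and a label on each concept, which is what the identifier discussion earlier in this thread asks for.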

tomkralidis commented 1 year ago

As discussed during TT-WISMD 2023-01-11:

tomkralidis commented 1 year ago

PR #78 updated.

tomkralidis commented 1 year ago

Implemented in #78.