w3c / dxwg

Data Catalog Vocabulary (DCAT)
https://w3c.github.io/dxwg/dcat/

Refine the way DCAT links distribution to structural metadata. #1426

Open riccardoAlbertoni opened 2 years ago

riccardoAlbertoni commented 2 years ago

DCAT 2 provides dcterms:conformsTo to indicate the model, schema, ontology, view, or profile that a representation of a dataset conforms to. However, dcterms:conformsTo can link to different kinds of standards, and not all the URIs used as objects of this property are necessarily self-describing.

There is a need to refine the DCAT 2 approach so that links to standards that are not self-describing become more machine-readable.
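To make the gap concrete, here is a minimal Python sketch, not a proposed solution. It models a catalog as plain (subject, predicate, object) tuples and lists the dcterms:conformsTo targets that carry no rdf:type statement in the same graph. Treating "has an rdf:type in the local graph" as a stand-in for "self-describing" is an assumption made for illustration only, and every example.org URI and the function name `undescribed_standards` are hypothetical.

```python
# Namespace terms used by DCAT 2 for this link.
DCTERMS_CONFORMS_TO = "http://purl.org/dc/terms/conformsTo"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCTERMS_STANDARD = "http://purl.org/dc/terms/Standard"

# Hypothetical catalog fragment: two distributions, each pointing at a standard.
# Only the second standard is described (typed) in the graph.
triples = {
    ("https://example.org/dist/csv", DCTERMS_CONFORMS_TO,
     "https://example.org/schema/obs-table"),
    ("https://example.org/dist/json", DCTERMS_CONFORMS_TO,
     "https://example.org/profile/obs-json"),
    ("https://example.org/profile/obs-json", RDF_TYPE, DCTERMS_STANDARD),
}


def undescribed_standards(graph):
    """Return conformsTo targets that have no rdf:type statement in the graph."""
    targets = {o for (_, p, o) in graph if p == DCTERMS_CONFORMS_TO}
    typed = {s for (s, p, _) in graph if p == RDF_TYPE}
    return sorted(targets - typed)


print(undescribed_standards(triples))
```

A client that meets an untyped target like this one can only treat the URI as an opaque identifier, which is the situation this issue asks to improve.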

Please see #1418 for the preliminary discussion of this requirement. Also, note that the sought solution should work with any kind of distribution, not only tabular data.

smrgeoinfo commented 2 years ago

One approach is to factor 'conformsTo' along two axes:

  1. The content axis: which properties are given values in the dataset representation, i.e. which conceptual properties are quantified for data instances in the described data. These have granularity, e.g. 'air temperature', 'air temperature 6 m above surface', 'air temperature 6 m above surface measured by procedure Y (10 min time average)'.
  2. The representation axis: how the values are represented. String? Controlled vocabulary? Units of measure? Integer? Decimal? Time series (station, time)? Binary grid (X, Y, [Z], Time)?

A data interchange profile might specify one or both of these, which is useful for clients that recognize the profile URI. The content axis is probably more useful in a general discovery scenario; the representation details are critical for machine interoperability. Explicit enumeration of the properties that are quantified for data instances in the dataset (e.g. schema.org variableMeasured) is probably most useful for data discovery.
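The two axes could be sketched as a small data structure, shown here in Python purely as an illustration. The class names `VariableSpec` and `InterchangeProfile`, their fields, the example.org URI, and the unit string are all hypothetical; none of them come from DCAT or schema.org.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class VariableSpec:
    # Content axis: the conceptual property that is quantified.
    name: str
    # Representation axis: how values are encoded (both optional).
    datatype: Optional[str] = None
    unit: Optional[str] = None


@dataclass
class InterchangeProfile:
    uri: str
    variables: List[VariableSpec] = field(default_factory=list)

    def content_terms(self) -> List[str]:
        """Names of the quantified properties, e.g. for discovery indexing."""
        return [v.name for v in self.variables]

    def specifies_representation(self) -> bool:
        """True only if every listed variable declares a datatype."""
        return bool(self.variables) and all(
            v.datatype is not None for v in self.variables
        )


# A profile that fixes both axes for one variable.
full = InterchangeProfile(
    uri="https://example.org/profile/air-temp",
    variables=[
        VariableSpec("air temperature 6 m above surface", "decimal", "degree Celsius")
    ],
)

# A profile that only enumerates content, leaving representation open.
content_only = InterchangeProfile(
    uri="https://example.org/profile/air-temp-content",
    variables=[VariableSpec("air temperature")],
)

print(full.content_terms(), full.specifies_representation())
print(content_only.content_terms(), content_only.specifies_representation())
```

Keeping the representation fields optional reflects the point that a profile might specify one axis without the other: a discovery client can rely on `content_terms()` alone, while an interoperability client would first check `specifies_representation()`.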

davebrowning commented 1 year ago

Reason for update: As DCAT v3 moves through review and, hopefully, ratification, we want to make sure that open issues and feedback that have yet to be completely addressed are properly recorded and tagged/assigned in GitHub. This clarifies their status and helps us review and prioritise them as a source of improvements and new requirements in future DCAT versions.