linked-statistics / COOS

Core ontology for official statistics
Creative Commons Attribution 4.0 International

Comments on datasets in relation to DCAT #81

Closed ChLaaboudi closed 1 year ago

ChLaaboudi commented 2 years ago

Comments on datasets in relation to DCAT

I. In Section 5, we defined five types (sub-classes) of coos:StatisticalDataset: Dimensional, KeyValue, Transposed, Rectangular and Graph. We explained that the types are not mutually exclusive and that several types can be used to structure the same data. Nevertheless, it is not clear how the types relate to each other, how they are linked to the master dataset, or how to apply the properties coos:metadataFor, coos:presentation and coos:content to characterize the metadata sets.

In DCAT, there are two options for representing the dataset types:

Option 1: Represent each sub-type as a distribution (dcat:Distribution) of the main StatisticalDataset. In that case, the type is assigned as a property of the distribution (dct:type) and managed as a controlled vocabulary.
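As a rough illustration of Option 1, the following Python sketch builds the triples for a dataset whose sub-types are modelled as distributions and prints them as Turtle. It is only a sketch under stated assumptions: the `ex` and `extype` prefixes, the dataset name `ex:population` and the controlled-vocabulary terms are hypothetical, and the `coos` IRI is a placeholder rather than the published COOS namespace.

```python
# Prefixes for the sketch. "ex" and "extype" are hypothetical, and the
# "coos" IRI is a placeholder, not the published COOS namespace.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "coos": "http://example.org/coos#",
    "ex": "http://example.org/data/",
    "extype": "http://example.org/dataset-type/",
}


def distribution_triples(dataset, subtypes):
    """Model each sub-type as a dcat:Distribution of the main dataset."""
    triples = [(dataset, "a", "coos:StatisticalDataset")]
    for name in subtypes:
        dist = f"{dataset}-{name.lower()}"
        triples.append((dataset, "dcat:distribution", dist))
        triples.append((dist, "a", "dcat:Distribution"))
        # The sub-type is a term from a (hypothetical) controlled vocabulary.
        triples.append((dist, "dct:type", f"extype:{name}"))
    return triples


def to_turtle(triples):
    """Serialize prefixed-name triples as simple Turtle statements."""
    header = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in PREFIXES.items()]
    body = [f"{s} {p} {o} ." for s, p, o in triples]
    return "\n".join(header + [""] + body)


triples = distribution_triples("ex:population", ["Dimensional", "KeyValue"])
print(to_turtle(triples))
```

With this modelling, the sub-type is only a label on the distribution; the dataset itself remains a single coos:StatisticalDataset.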

Option 2: DCAT-3 (still a draft) proposes a very interesting approach for representing "data series". It defines a dcat:DatasetSeries (in our case, coos:StatisticalDataset), and the different representations (the sub-types: Dimensional, KeyValue, Transposed, etc.) are associated with the dcat:DatasetSeries through a dedicated property.
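Option 2 could be sketched along the following lines. The comment above does not name the linking property; this sketch assumes dcat:inSeries, which DCAT 3 defines for linking a dataset to the series it belongs to. The `ex` and `extype` prefixes and the dataset names are hypothetical, and the `coos` IRI is a placeholder rather than the published COOS namespace.

```python
# Prefixes for the sketch. "ex" and "extype" are hypothetical, and the
# "coos" IRI is a placeholder, not the published COOS namespace.
SERIES_PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "coos": "http://example.org/coos#",
    "ex": "http://example.org/data/",
    "extype": "http://example.org/dataset-type/",
}


def series_triples(series, subtypes):
    """Model the master dataset as a series and each sub-type as a member dataset."""
    series_statements = [
        (series, "a", "dcat:DatasetSeries"),
        (series, "a", "coos:StatisticalDataset"),
    ]
    for name in subtypes:
        member = f"{series}-{name.lower()}"
        series_statements.append((member, "a", "dcat:Dataset"))
        # Assumption: the sub-type is recorded with dct:type on the member.
        series_statements.append((member, "dct:type", f"extype:{name}"))
        # Assumption: dcat:inSeries is the property linking member to series.
        series_statements.append((member, "dcat:inSeries", series))
    return series_statements


def series_to_turtle(statements):
    """Serialize prefixed-name triples as simple Turtle statements."""
    header = [f"@prefix {p}: <{iri}> ." for p, iri in SERIES_PREFIXES.items()]
    body = [f"{s} {p} {o} ." for s, p, o in statements]
    return "\n".join(header + [""] + body)


series_statements = series_triples(
    "ex:population", ["Dimensional", "KeyValue", "Transposed"]
)
print(series_to_turtle(series_statements))
```

Compared with Option 1, each representation here is a dataset in its own right and can carry its own distributions and metadata.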


II. We have proposed the following properties, with domain coos:StatisticalProduct, to further qualify statistical datasets:

1) coos:content, with the values data, metadata, analysis or model. In my opinion, the value "data" is implicitly relevant for any coos:StatisticalDataset. Analysis or model should rather be associated with a normative document or compliant framework.

2) coos:presentation, with the values Dataset, Publication, Visualisation, Infographic, Thematic Map, or Interactive Dataset or Visualisation. It can typically be used to qualify a type of distribution (with an access URL to the resource).

3) coos:metadataFor, which can be used to associate a set of metadata with the dataset that it qualifies. The description of the property should be improved. What is the range of this property (the sub-classes)? Its usage seems very close to that of dcat:distribution, which links the dataset description to the data.


III. In the specifications, we haven't covered how to reference or formalise the documentation about the dataset (metadata structure, conformance to a standard).

FranckCo commented 2 years ago

Clarified title

FranckCo commented 1 year ago

Following the 9/7 meeting: create dedicated issues for the different subjects listed above.

FranckCo commented 1 year ago

Closing the issue: content is split between issue #97 and issue #99.