ruisdael-observatory / Ruisdael-Data-Catalog

Repository to aid the creation of the Ruisdael Data Catalog - documentation, research and scripts
https://ruisdael-catalog.citg.tudelft.nl

choose a vocabulary and schema to describe the catalog resources #1

Closed andrecastro0o closed 3 months ago

andrecastro0o commented 3 months ago

To build a data catalog, we need to decide which terms and schema we will use to describe the catalog resources, mainly the datasets.

I see two possible options:

A: Data Catalog Vocabulary DCAT & DCAT-AP Application Profile

Data Catalog Vocabulary DCAT

Most relevant DCAT classes (for our data catalog context) :

DCAT-AP Application Profile

The prime objective of the Application Profile is to enhance data findability and promote reusability. To achieve this goal, datasets should be coherently documented. To enable this, the Application Profile considers several essential aspects, including, among others:

- Understanding the data or service structure, and how to get access to the data
- Information on scope or purpose of the data
- Legal information
- Knowledge on data publishers, and any other agents involved
- Knowledge of data availability and change policies

An application that provides metadata MUST (relevant items for the Ruisdael catalog):

(@andrecastro0o Bold - most relevant)

- Provide a description of the Catalogue, including at least the mandatory properties specified for the class Catalogue.
- Provide descriptions of Datasets in the Catalogue, including at least the mandatory properties for the class Dataset.
- Provide descriptions of Distributions, if any, of Datasets in the Catalogue, including at least the mandatory properties for the class Distribution.
- Provide descriptions of Data Services, if any, of Datasets in the Catalogue, including at least the mandatory properties for the class Data Service.
- Provide descriptions of all organisations involved in the descriptions of Catalogue and Datasets, including at least the mandatory properties for the class Agent.
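To make the requirements above concrete, here is a minimal sketch that describes a catalogue, a dataset, a distribution and an agent as JSON-LD nodes and reports which mandatory properties are missing. The per-class lists in `MANDATORY` are an assumption based on DCAT-AP 3.0 and should be checked against the specification; the `example.org` identifiers, the placeholder titles and the `missing_mandatory` helper are hypothetical.

```python
import json

# Namespace prefixes used in the node descriptions below.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# ASSUMPTION: mandatory properties per class, to be verified against
# the DCAT-AP release the catalog ends up targeting.
MANDATORY = {
    "dcat:Catalog": ["dct:title", "dct:description", "dct:publisher"],
    "dcat:Dataset": ["dct:title", "dct:description"],
    "dcat:Distribution": ["dcat:accessURL"],
    "dcat:DataService": ["dct:title", "dcat:endpointURL"],
    "foaf:Agent": ["foaf:name"],
}


def missing_mandatory(node):
    """Return the mandatory properties that are absent or empty on one node."""
    required = MANDATORY.get(node.get("@type"), [])
    return [prop for prop in required if not node.get(prop)]


# Hypothetical placeholder resources, not real catalog entries.
publisher = {
    "@id": "https://example.org/agent/1",
    "@type": "foaf:Agent",
    "foaf:name": "Example organisation",
}
distribution = {
    "@id": "https://example.org/distribution/1",
    "@type": "dcat:Distribution",
    "dcat:accessURL": {"@id": "https://example.org/files/1.nc"},
}
dataset = {
    "@id": "https://example.org/dataset/1",
    "@type": "dcat:Dataset",
    "dct:title": "Example dataset",
    "dct:description": "Placeholder description of an example dataset.",
    "dcat:distribution": [{"@id": distribution["@id"]}],
}
catalog = {
    "@id": "https://example.org/catalog",
    "@type": "dcat:Catalog",
    "dct:title": "Example catalogue",
    "dct:description": "Placeholder description of an example catalogue.",
    "dct:publisher": {"@id": publisher["@id"]},
    "dcat:dataset": [{"@id": dataset["@id"]}],
}

nodes = [catalog, dataset, distribution, publisher]
document = {"@context": CONTEXT, "@graph": nodes}

# Map each node identifier to the list of mandatory properties it lacks.
report = {node["@id"]: missing_mandatory(node) for node in nodes}
print(json.dumps(report, indent=2))
```

A check like this only covers presence of properties; it does not validate value types or controlled vocabularies, for which SHACL validation against the DCAT-AP shapes would be the more complete route.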

DCAT-AP main entities: core classes and properties (mandatory/recommended)

Classes:

We might need to include geolocation properties denoting a geo-point (lat, lon, alt) or polygons. Which properties should we use for those?
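One candidate answer, offered as an assumption rather than a decision: DCAT defines `dcat:centroid` and `dcat:bbox` on a `dct:Location`, which a dataset references through `dct:spatial`, and both accept geometry literals such as WKT. The sketch below formats a point and a polygon as WKT and wraps them in a location node. The helper names and coordinates are hypothetical, and the longitude-first axis order assumes the default CRS84 reference system of GeoSPARQL.

```python
# Datatype IRI for WKT literals as defined by GeoSPARQL.
WKT_LITERAL = "http://www.opengis.net/ont/geosparql#wktLiteral"


def wkt_point(lon, lat, alt=None):
    """Format a point as WKT, longitude first; altitude is optional."""
    if alt is None:
        return f"POINT({lon} {lat})"
    # 3D form; whether downstream tools accept it is an open question.
    return f"POINT Z ({lon} {lat} {alt})"


def wkt_polygon(ring):
    """Format one outer ring of (lon, lat) pairs as a WKT polygon."""
    closed = list(ring)
    if closed[0] != closed[-1]:
        closed.append(closed[0])  # WKT rings must end where they start
    coords = ", ".join(f"{lon} {lat}" for lon, lat in closed)
    return f"POLYGON(({coords}))"


def location_node(centroid=None, bbox=None):
    """Build a dct:Location node carrying WKT-typed geometry literals."""
    node = {"@type": "dct:Location"}
    if centroid is not None:
        node["dcat:centroid"] = {"@value": centroid, "@type": WKT_LITERAL}
    if bbox is not None:
        node["dcat:bbox"] = {"@value": bbox, "@type": WKT_LITERAL}
    return node


# Hypothetical coordinates, for illustration only.
station = location_node(centroid=wkt_point(4.37, 52.0))
area = location_node(
    bbox=wkt_polygon([(4.0, 51.0), (5.0, 51.0), (5.0, 52.0), (4.0, 52.0)])
)
print(station)
print(area)
```

A dataset would then point at such a node with `dct:spatial`. Arbitrary polygons that are not bounding boxes would need a different property, for example `locn:geometry`, which is also an assumption to verify.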

DCAT-AP diagram

Controlled vocabularies for DCAT properties

See https://semiceu.github.io/DCAT-AP/releases/3.0.0/#controlled-vocabularies-to-be-used

B: DataCite Metadata Schema

See DataCite Mandatory Properties and compare to DCAT-AP classes + properties
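As a starting point for that comparison, the sketch below lists the six DataCite mandatory properties next to candidate DCAT-AP counterparts and flags which of them are mandatory for a DCAT-AP Dataset. The crosswalk targets and the DCAT-AP mandatory set are assumptions to be checked against both specifications, and the `compare` helper is hypothetical.

```python
# DataCite mandatory properties (Metadata Schema 4.x).
DATACITE_MANDATORY = [
    "Identifier",
    "Creator",
    "Title",
    "Publisher",
    "PublicationYear",
    "ResourceType",
]

# ASSUMPTION: candidate crosswalk to properties usable on dcat:Dataset.
CANDIDATE_CROSSWALK = {
    "Identifier": "dct:identifier",
    "Creator": "dct:creator",
    "Title": "dct:title",
    "Publisher": "dct:publisher",
    "PublicationYear": "dct:issued",
    "ResourceType": "dct:type",
}

# ASSUMPTION: mandatory properties of the DCAT-AP Dataset class.
DCAT_AP_DATASET_MANDATORY = {"dct:title", "dct:description"}


def compare():
    """Return (DataCite property, DCAT-AP candidate, mandatory in DCAT-AP?) rows."""
    return [
        (prop, CANDIDATE_CROSSWALK[prop],
         CANDIDATE_CROSSWALK[prop] in DCAT_AP_DATASET_MANDATORY)
        for prop in DATACITE_MANDATORY
    ]


rows = compare()
# DCAT-AP mandatory properties with no DataCite mandatory counterpart.
uncovered = sorted(DCAT_AP_DATASET_MANDATORY - set(CANDIDATE_CROSSWALK.values()))

for datacite_prop, dcat_prop, mandatory in rows:
    print(f"{datacite_prop:16} -> {dcat_prop:16} mandatory in DCAT-AP: {mandatory}")
print("DCAT-AP mandatory without DataCite mandatory counterpart:", uncovered)
```

Under these assumptions the two schemas overlap only on the title at the mandatory level: DataCite requires more about identity and attribution, while DCAT-AP requires a description.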

andrecastro0o commented 3 months ago

This issue is now in the markdown file research/01-metadata-schema.md