This is a draft version of Health-RI metadata schema 2.0 intended for review.
Latest published version (version 1.0.0) available here.
This branch contains the draft version of the plateau 2 core and generic health metadata schema, detailing the classes and entities involved and offering usage notes for developers. It addresses the schema's design and application but excludes discussion of the national catalog and its onboarding process. It is aimed at a technical audience tasked with reviewing the metadata schema.
Feedback on the draft version is being collected via issues in this repository, preferably via the provided template.
Building on the 1st version of the metadata schema, the scope of the plateau 2 version is to incorporate both DCAT-AP NL and the (yet to be finalized) HealthDCAT-AP, as well as Health-RI specific requirements / needs for the National Health Data Catalogue.
It therefore introduces several health-related properties (indicated in blue in the UML diagram below), with (where applicable) suggested or required controlled vocabularies.
In addition, several ELSI-related metadata fields, as gathered by the Health-RI ELSI team, are included in this draft version, although not mandatory. The use of these properties will be explored and evaluated once the new version is implemented in the catalogue.
Next to that, the Project and Study classes are currently still under development. Therefore, the proposed properties, cardinalities and ranges are a starting point, and your input on these two classes is very welcome! If you would like to join the discussions on these two classes, feel free to contact us.
Finally, the newly introduced property `data origin` (in grey in the UML), which is intended to distinguish non-synthetic from synthetic data, is included in the draft but still has to be modelled further. We now propose to indicate the nature of the data (e.g. whole genome sequencing data or questionnaire data) with `healthdcatap:healthCategory` and `healthdcatap:healthTheme`.
In version 2 of the schema, we extended the current version, which is based on the DCAT-AP 3.0 specification, by adding new properties from HealthDCAT-AP and DCAT-AP NL and by changing cardinalities to make it compatible with both extensions. Please note that HealthDCAT-AP is still a draft, so we made some properties less strict than it currently specifies. Once the official release is out, we will re-evaluate the HRI schema and make it compatible with HealthDCAT-AP.
In the HRI schema, we categorize components into `mandatory` and `recommended` classes and properties. A potential third category, `optional`, may be introduced in the future.
In the context of data exchange:
- Mandatory class: Senders MUST provide information about instances of the class; Receivers MUST process information about instances of the class.
- Recommended class: Senders SHOULD provide information about instances of the class if available; Receivers MUST process information about instances of the class.
- Optional class: Senders MAY provide the information but are not obliged to do so; Receivers MUST process information about instances of the class.
- Mandatory property: Senders MUST provide the information for that property; Receivers MUST process the information for that property.
- Recommended property: Senders SHOULD provide the information if available; Receivers MUST process the information for that property.
- Optional property: Senders MAY provide the information but are not obliged to do so; Receivers MUST process the information for that property.
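The sender-side obligations above can be sketched as a small check. This is a minimal illustration in Python, not part of the schema: the `PropertyRule` type, the `check_record` function, the `ex:note` property and the requirement levels given to the three sample rules are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyRule:
    """Hypothetical pairing of a property with its requirement level."""
    uri: str
    level: str  # "mandatory", "recommended" or "optional"


def check_record(record, rules):
    """Return (errors, warnings) for a sender-side metadata record.

    A missing mandatory property violates a MUST and is reported as an error.
    A missing recommended property falls short of a SHOULD and is a warning.
    A missing optional property (MAY) is never reported.
    """
    errors, warnings = [], []
    for rule in rules:
        if record.get(rule.uri):  # present and non-empty
            continue
        if rule.level == "mandatory":
            errors.append(f"missing mandatory property {rule.uri}")
        elif rule.level == "recommended":
            warnings.append(f"missing recommended property {rule.uri}")
    return errors, warnings


# Illustrative rules only; the levels here are assumptions for the example.
RULES = [
    PropertyRule("dct:title", "mandatory"),
    PropertyRule("dct:language", "recommended"),
    PropertyRule("ex:note", "optional"),
]

errors, warnings = check_record({"dct:language": ["nld"]}, RULES)
print(errors)    # ['missing mandatory property dct:title']
print(warnings)  # []
```

Receivers are not modelled here: in every category they MUST process whatever information is provided.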
According to DCAT-AP:
An Application Profile defines the mandatory, recommended, and optional components for a specific use case by leveraging terminology from foundational standards. Additionally, it suggests standardized vocabularies to maintain consistency in the use of terms and data.
A Dataset is a self-contained set of data produced by a specific organization, which can be accessed or downloaded for various uses. A Data Portal is an online platform that offers a catalog of datasets and tools to help users locate and utilize these datasets effectively.
Prefix | Namespace IRI | Source |
---|---|---|
adms | http://www.w3.org/ns/adms# | VOCAB-ADMS |
dcat | http://www.w3.org/ns/dcat# | VOCAB-DCAT |
dct | http://purl.org/dc/terms/ | DCT |
foaf | http://xmlns.com/foaf/0.1/ | FOAF |
owl | http://www.w3.org/2002/07/owl# | OWL2-SYNTAX |
rdf | http://www.w3.org/1999/02/22-rdf-syntax-ns# | RDF-SYNTAX-GRAMMAR |
rdfs | http://www.w3.org/2000/01/rdf-schema# | RDF-SCHEMA |
skos | http://www.w3.org/2004/02/skos/core# | SKOS-REFERENCE |
spdx | http://spdx.org/rdf/terms# | SPDX |
time | http://www.w3.org/2006/time# | OWL-TIME |
xsd | http://www.w3.org/2001/XMLSchema# | XMLSCHEMA11-2 |
vcard | http://www.w3.org/2006/vcard/ns# | VCARD |
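To make the role of these prefixes concrete, the sketch below expands a prefixed name such as `dcat:Dataset` into a full IRI. It assumes nothing beyond the namespaces listed in the table; the `PREFIXES` mapping and the `expand` helper are hypothetical names introduced for illustration.

```python
# Namespace IRIs copied from the prefix table above.
PREFIXES = {
    "adms": "http://www.w3.org/ns/adms#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "spdx": "http://spdx.org/rdf/terms#",
    "time": "http://www.w3.org/2006/time#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
}


def expand(curie):
    """Expand a prefixed name such as 'dcat:Dataset' to a full IRI."""
    prefix, sep, local_name = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local_name


print(expand("dcat:Dataset"))  # http://www.w3.org/ns/dcat#Dataset
print(expand("dct:title"))     # http://purl.org/dc/terms/title
```

Prefixes that appear later in this document but are not listed in the table (for example `healthdcatap` or `dpv`) raise a `KeyError` in this sketch rather than being guessed.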
An overview of the metadata schema core is presented in the UML diagram depicted below. The UML showcases the primary classes (entities), excluding detailed definitions such as rdfs:label and rdfs:comment. Each block denotes a class and comprises a list of its attributes (properties). A closed arrow from one class to another indicates that the first class inherits all properties of the second. For example, `dcat:DatasetSeries` inherits from `dcat:Dataset`, which inherits from `dcat:Resource`. The other arrows represent relations. Each is labelled with the type of relation (for example, `dcat:Dataset` connects to a `dcat:DatasetSeries` via the predicate `dcat:inSeries`) and with the cardinality (for example, a `dcat:Dataset` can be connected via `dcat:inSeries` to zero or more `dcat:DatasetSeries`).
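The inheritance rule described above can be sketched in a few lines of Python. The subclass links are the ones named in the text; the `OWN_PROPERTIES` lists are hypothetical and heavily abbreviated, and where each property is declared is an assumption made only to show how inherited properties accumulate.

```python
# Subclass links taken from the example in the text above.
SUBCLASS_OF = {
    "dcat:DatasetSeries": "dcat:Dataset",
    "dcat:Dataset": "dcat:Resource",
}

# Hypothetical, heavily abbreviated property lists, for illustration only.
OWN_PROPERTIES = {
    "dcat:Resource": ["dct:title", "dct:description"],
    "dcat:Dataset": ["dcat:distribution", "dcat:inSeries"],
    "dcat:DatasetSeries": [],
}


def class_chain(cls):
    """Return the class followed by all of its ancestors, nearest first."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def all_properties(cls):
    """Collect properties from the most general ancestor down to the class."""
    props = []
    for current in reversed(class_chain(cls)):
        props.extend(OWN_PROPERTIES.get(current, []))
    return props


print(class_chain("dcat:DatasetSeries"))
# ['dcat:DatasetSeries', 'dcat:Dataset', 'dcat:Resource']
print(all_properties("dcat:DatasetSeries"))
# ['dct:title', 'dct:description', 'dcat:distribution', 'dcat:inSeries']
```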
Next to the UML, a tabular overview of all classes and properties, including their range, cardinality, controlled vocabulary (if applicable) and usage note, is provided below. The same information can be found in this sheet, which also states the origin of each (new) constraint (DCAT-AP v3, DCAT-AP NL or HealthDCAT-AP).
Class name | Definition | Usage Note | URI | Example |
---|---|---|---|---|
Dataset | A resource type. A meaningful collection of data, published or curated by a single organisation or individual, and available for access or download in one or more representations. | When focusing on health data, a dataset typically contains structured information gathered from a study or research project related to health topics. This might include clinical trial results, public health statistics, patient records, survey data, etc. How the data in a dataset can be accessed is defined in the Distribution, which usually points to the actual data files available for access or download. Datasets are often included in a catalog, which organizes and provides metadata about multiple datasets, making them easier to find and use. The term 'organization or individual' refers to any entity responsible for creating, maintaining, or distributing the dataset. | dcat:Dataset | Questionnaire data of the Personalised RISk-based MAmmascreening Study (PRISMA), Clinical data for Inflammatory Bowel Disease (IBD) from AUMC, LUMC and UMCG |
Catalog | A catalog that is listed in the National catalog. | Used to describe a bundle of datasets (and other resources) under a single title, for example a collection or a study. | dcat:Catalog | NA |
Agent | An entity that is associated with catalog and/or Datasets. | A person or organization that is associated with the catalogue and/or datasets. | foaf:Agent | NA |
Cataloged Resource | Resource published or curated by a single agent. | This is an abstract class, we do not use this class, instead we use specifications of it (e.g. Dataset). This is mainly for a high level grouping and the reuse of properties. | dcat:Resource | NA |
Kind | A description following the vCard specification, e.g. to provide telephone number and e-mail address for a contact point. | Used to describe contact information for Dataset and DatasetSeries. | vcard:Kind | NA |
Class name | Definition | Usage Note | URI |
---|---|---|---|
Distribution | An available distribution of the dataset. | Used to describe the different ways in which a single dataset can be made available, i.e. it can be downloaded or accessed online in one or more distributions (e.g. one as a downloadable .csv file, another as an access or query web page). | dcat:Distribution |
Dataset Series | A collection of datasets that are published separately, but share some characteristics that group them. | With Dataset Series we refer to data, somehow interrelated, that are published separately. An example is budget data split by year and/or country, instead of being made available in a single dataset. | dcat:DatasetSeries |
Data Service | A Resource type. A collection of operations that provides access to one or more datasets or data processing functions. | The kind of service can be indicated using the dct:type property. Its value may be taken from a controlled vocabulary that should be defined in the community. | dcat:DataService |
Project | A collective endeavour of some kind. The Project class represents the class of things that are 'projects'. These may be formal or informal, collective or individual. It is often useful to indicate the homepage of a Project. | Used to denote the information of a funded project, including funding agent. A project can consist of several studies. | foaf:Project |
Study | A Study represents the process by which a data set was generated or collected. | Used to describe the information of a study that generates or collects data described in a dataset. A study is connected to one project. | TBA |
Cataloged Resource is a generic concept from the DCAT vocabulary that is rarely used directly, but indirectly through its extensions. We recommend not using `dcat:Resource` directly in your documents; if the type/class you need is not in this schema, request a model extension or update instead.
Class name | Definition | Usage Note | URI |
---|---|---|---|
Cataloged Resource | The class resource, everything. | This class is for grouping and class hierarchy relation purposes. | dcat:Resource |
A curated collection of metadata about resources. A web-based data catalog is typically represented as a single instance of this class.
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality | Example |
---|---|---|---|---|---|---|
applicable legislation | The legislation that mandates the creation or management of the Catalog. | dcatap:applicableLegislation | eli:LegalResource | TBA | 1..* | NA |
contact point | Relevant contact information for the Catalogue. | dcat:contactPoint | vcard:Kind | TBA | 1 | NA |
description | A free-text account of the record. | dct:description | rdfs:Literal | A brief informative description of the catalogue. This property can be repeated for descriptions in different languages. | 1..* | This catalogue describes the core metadata of AUMC Inflammatory Bowel Disease datasets or This catalogue describes breast cancer imaging, clinical and omics datasets. |
publisher | An entity (organisation) responsible for making the Catalogue available. | dct:publisher | foaf:Agent | The organization that published the catalogue (e.g. the specific UMC in question). In case of a multicenter study, the publisher is the organisation who makes the catalogue available online. To list multiple organisations involved, refer to the "creator" property. | 1 | name: Radboud University Medical Center; identifier: https://ror.org/05wg1m734 (see class foaf:Agent) |
title | A name given to the Catalogue. | dct:title | rdfs:Literal | A name given to the catalogue. This property can be repeated for providing titles in different languages. This is a required field and needs to be unique. | 1..* | Inflammatory Bowel Disease catalogue, Inflammatoire darmziekten catalogus |
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality |
---|---|---|---|---|---|
catalog | A catalog that is listed in the catalog. | dcat:catalog | dcat:Catalog | NA | 0..* |
creator | An entity responsible for the creation of the catalogue. | dct:creator | foaf:Agent | NA | 0..* |
dataset | Relates every catalog to the datasets it contains. | dcat:dataset | dcat:Dataset | The connection to the one or more datasets that this catalog describes. | 0..* |
geographical coverage | A geographical area covered by the Catalogue. | dct:spatial | dct:Location | The EU Vocabularies Name Authority Lists must be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs must be used. For districts or neighbourhoods in NL, the Dutch vocab can be used. | 0..* |
has part | A related Catalogue that is part of the described Catalogue. | dct:hasPart | dcat:Catalog | NA | 0..* |
home page | A web page that acts as the main page for the Catalogue. | foaf:homepage | foaf:Document | Could be a page describing the catalogue, incl. link to catalogue. | 0..1 |
language | A language used in the textual metadata describing titles, descriptions, etc. of the Datasets in the Catalogue. | dct:language | dct:LinguisticSystem | NA | 0..* |
license | A licence under which the Catalogue can be used or reused. | dct:license | dct:LicenseDocument | NA | 0..1 |
modification date | The most recent date on which the Catalogue was modified. | dct:modified | xsd:dateTime | NA | 0..1 |
record | A Catalogue Record that is part of the Catalogue. | dcat:record | dcat:CatalogRecord | NA | 0..* |
release date | The date of formal issuance (e.g., publication) of the Catalogue. | dct:issued | xsd:dateTime | NA | 0..1 |
rights | A statement that specifies rights associated with the Catalogue. | dct:rights | dct:RightsStatement | NA | 0..1 |
service | A service that is listed in the catalog. | dcat:service | dcat:DataService | NA | 0..* |
temporal coverage | A temporal period that the Catalogue covers. | dct:temporal | dct:PeriodOfTime | NA | 0..* |
themes | A knowledge organisation system used to classify the Catalogue's Datasets. | dcat:themeTaxonomy | skos:ConceptScheme | This property refers to a knowledge organisation system used to classify the Catalogue's Datasets. It must have at least the value NAL:data-theme as this is the mandatory controlled vocabulary for dcat:theme. | 0..* |
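The cardinality notation used in these tables (`1`, `0..1`, `0..*`, `1..*`) can be read mechanically. The sketch below is one possible interpretation in Python; `parse_cardinality` and `satisfies` are hypothetical helper names, not part of the schema or of any library.

```python
def parse_cardinality(text):
    """Parse '1', '0..1' or '1..*' into (minimum, maximum).

    A maximum of None means the number of values is unbounded.
    """
    lower, sep, upper = text.strip().partition("..")
    minimum = int(lower)
    if not sep:  # a single number such as '1' means exactly that many
        return minimum, minimum
    return minimum, (None if upper == "*" else int(upper))


def satisfies(cardinality, count):
    """Check whether `count` values are allowed by the cardinality."""
    minimum, maximum = parse_cardinality(cardinality)
    return count >= minimum and (maximum is None or count <= maximum)


print(parse_cardinality("1..*"))  # (1, None)
print(satisfies("0..1", 2))       # False
print(satisfies("1", 1))          # True
```

For example, a Catalogue with two `foaf:homepage` values would fail the `0..1` constraint listed above, while any number of `dcat:dataset` values satisfies `0..*`.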
A meaningful collection of data, published or curated by a single organisation or individual, and available for access or download in one or more representations.
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality | Example |
---|---|---|---|---|---|---|
access rights | Information that indicates whether the Dataset is publicly accessible, has access restrictions or is not public. | dct:accessRights | dct:RightsStatement (IRI) | Information that indicates whether the Dataset is publicly accessible, has access restrictions or is not public. Use one of the following values from this vocabulary (:public, :restricted, :non-public). | 1 | http://publications.europa.eu/resource/authority/access-right/RESTRICTED |
applicable legislation | The legislation that mandates the creation or management of the Dataset. | dcatap:applicableLegislation | eli:LegalResource | For health datasets, the value must include the ELI of the EHDS Regulation. As multiple legislations may apply to the resource, the maximum cardinality is not limited. | 1..* | NA |
contact point | Contact information that can be used for sending comments about the Dataset. | dcat:contactPoint | vcard:Kind | Contact information that can be used, for example, for sending requests for information or access to the dataset. Ideally, a data access committee or other service desk (a contact point that is rather persistent over time). | 1 | mailto:data-access-committee@xumc.nl with name Data Access Committee of the x UMC (see vcard:Kind) |
creator | An entity responsible for producing the dataset. | dct:creator | foaf:Agent | The person or persons responsible for creating the dataset. | 1..* | Jip Fictief, Inez Maginary, Fabio Abricated for name of foaf:Agent |
description | A free-text account of the Dataset. | dct:description | rdfs:Literal | A free-text informative description of the dataset. This property can be repeated for providing descriptions in different languages. | 1..* | The primary aim of the PRISMA study was to investigate the potential value of risk-tailored versus traditional breast cancer screening protocols in the Netherlands. Data collection took place between 2014-2019, resulting in ∼67,000 mammograms, ∼38,000 surveys, ∼10,000 blood samples and ∼600 saliva samples. |
geographical coverage | A geographic region that is covered by the Dataset. | dct:spatial | dct:Location | The EU Vocabularies Name Authority Lists must be used for continents, countries and places that are in those lists; if a particular location is not in one of the mentioned Named Authority Lists, Geonames URIs must be used. For districts or neighbourhoods in NL, the Dutch vocab can be used. | 1..* | http://publications.europa.eu/resource/authority/place/NLD_AMS |
health theme | A category of the Dataset or tag describing the Dataset. | healthdcatap:healthTheme | skos:Concept | A Dataset may be associated with multiple themes. Wikidata URIs MUST be used. | 1..* | https://www.wikidata.org/wiki/Q58624061 |
identifier | The main identifier for the Dataset, e.g. the URI or other unique identifier in the context of the Catalogue. | dct:identifier | rdfs:Literal | The main globally unique and persistent identifier of the dataset. Recommended practice is to identify the dataset by means of a string conforming to an identification system such as Digital Object Identifier (DOI). | 1 | https://doi.org/10.34894/ZLOYOJ |
keyword | A keyword or tag describing the Dataset. | dcat:keyword | rdfs:Literal | NA | 1..* | NA |
number of records | Size of the dataset in terms of the number of records. | healthdcatap:numberOfRecords | xsd:nonNegativeInteger | NA | 1 | NA |
publisher | An entity (organisation) responsible for making the Dataset available. | dct:publisher | foaf:Agent | The organization that published the dataset (e.g. the specific UMC in question). Can differ from catalogue publisher. | 1 | Radboud University Medical Center; identifier https://ror.org/05wg1m734 (see foaf:Agent) |
theme | A category of the Dataset. | dcat:theme | skos:Concept | A Dataset may be associated with multiple themes. The authority table for Data Themes, maintained by the Publications Office of the European Union, is the mandatory controlled vocabulary for dcat:theme. It must have at least the value NAL:data-theme "HEAL" to annotate health datasets. | 1..* | http://publications.europa.eu/resource/authority/data-theme/HEAL |
title | A name given to the Dataset. | dct:title | rdfs:Literal | A name given to the Dataset. This property can be repeated for providing names in parallel languages. | 1..* | Questionnaire data of the Personalised RISk-based MAmmascreening Study (PRISMA) |
type | A type of the Dataset. | dct:type | skos:Concept | A recommended controlled vocabulary data-type is foreseen, either from the dataset-type authority table or DCMI Type vocabulary. For health datasets containing personal level information, the type of the dataset MUST take the value "personal data". This list of terms provides types of datasets. Its main scope is to support dataset categorisation of the EU Open Data Portal. (To create a new entry for PERSONAL_DATA) | 1 | http://publications.europa.eu/resource/authority/dataset-type/PERSONAL_DATA |
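Two of the controlled-vocabulary rules above, on `dct:accessRights` and `dcat:theme`, lend themselves to a simple check. The following Python sketch is illustrative only: the dictionary layout of the record and the `check_dataset_vocabularies` function are hypothetical, and only the `RESTRICTED` and `HEAL` IRIs appear in the table. The `PUBLIC` and `NON_PUBLIC` IRIs are assumed to follow the same pattern in the EU access-right authority table.

```python
ACCESS_RIGHT_BASE = "http://publications.europa.eu/resource/authority/access-right/"

# RESTRICTED is taken from the table; PUBLIC and NON_PUBLIC are assumed
# to follow the same IRI pattern in the authority table.
ALLOWED_ACCESS_RIGHTS = {
    ACCESS_RIGHT_BASE + code for code in ("PUBLIC", "RESTRICTED", "NON_PUBLIC")
}

HEALTH_THEME = "http://publications.europa.eu/resource/authority/data-theme/HEAL"


def check_dataset_vocabularies(dataset):
    """Return a list of problems found in a dataset record (a plain dict)."""
    problems = []
    if dataset.get("dct:accessRights") not in ALLOWED_ACCESS_RIGHTS:
        problems.append("dct:accessRights is not one of PUBLIC, RESTRICTED, NON_PUBLIC")
    if HEALTH_THEME not in dataset.get("dcat:theme", []):
        problems.append("dcat:theme does not include the HEAL data theme")
    return problems


record = {
    "dct:accessRights": ACCESS_RIGHT_BASE + "RESTRICTED",
    "dcat:theme": [HEALTH_THEME],
}
print(check_dataset_vocabularies(record))  # []
```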
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality | Example |
---|---|---|---|---|---|---|
analytics | An analytics distribution of the dataset. | healthdcatap:analytics | dcat:Distribution | Publishers are encouraged to provide URLs pointing to API endpoints or document repositories where users can access or request associated resources such as technical reports of the dataset, quality measurements, usability indicators or analytics services. | 0..* | NA |
code values | Health classifications and their codes associated with the dataset. | healthdcatap:hasCodeValues | skos:Concept | A dataset may be associated with multiple health classifications. | 0..* | NA |
coding system | Coding systems in use (ex: ICD-10-CM, DGRs, SNOMED-CT, ...). | healthdcatap:hasCodingSystem | dct:Standard (IRI) | Wikidata URIs MUST be used. | 0..* | NA |
conforms to | An implementing rule or other specification. | dct:conformsTo | dct:Standard (IRI) | Wikidata URIs MUST be used. | 0..* | NA |
data origin | The origin of the data in the data set. | TBA | TBA | This property can be used to indicate whether a dataset contains synthetic or non-synthetic data. To further specify data categories (e.g. whole genome sequencing), healthdcatap:healthCategory (eventually filled with values from a controlled vocabulary) and healthdcatap:healthTheme can be used. | 0..1 | NA |
distribution | An available distribution of the dataset. | dcat:distribution | dcat:Distribution | Use this property to point to the distribution of this dataset when a distribution is available. For non-open health datasets, a distribution must include information on the Health Data Access Body supporting data access. | 0..* | NA |
documentation | A page or document about this Dataset. | foaf:page | foaf:Document (IRI) | NA | 0..* | NA |
frequency | The frequency at which the Dataset is updated. | dct:accrualPeriodicity | skos:Concept | A resource from the following authority table must be used: http://publications.europa.eu/resource/authority/frequency | 0..1 | http://publications.europa.eu/resource/authority/frequency/ANNUAL |
has version | A related Dataset that is a version, edition, or adaptation of the described Dataset. | dcat:hasVersion | dcat:Dataset | NA | 0..* | NA |
health category | The health category to which this dataset belongs as described in the Commission Regulation on the European Health Data Space laying down a list of categories of electronic data for secondary use, Art.33. | healthdcatap:healthCategory | skos:Concept | A mandatory controlled vocabulary denoting health data within the scope of the Commission Regulation is yet to be created. In the meantime, Health-RI will use substitute entries from Wikidata. | 0..* | NA |
in series | A dataset series of which the dataset is part. | dcat:inSeries | dcat:DatasetSeries | NA | 0..* | NA |
is referenced by | A related resource, such as a publication, that references, cites, or otherwise points to the dataset. | dct:isReferencedBy | rdfs:Resource | NA | 0..* | NA |
language | A language of the Dataset. | dct:language | dct:LinguisticSystem | A language from the following vocabulary: https://publications.europa.eu/resource/authority/language | 0..* | http://publications.europa.eu/resource/authority/language/NLD |
legal basis | The legal basis used to justify processing of personal data. | dpv:hasLegalBasis | dpv:LegalBasis | NA | 0..* | NA |
maximum typical age | Maximum typical age of the population within the dataset. | healthdcatap:maxTypicalAge | xsd:nonNegativeInteger | NA | 0..1 | NA |
minimum typical age | Minimum typical age of the population within the dataset. | healthdcatap:minTypicalAge | xsd:nonNegativeInteger | NA | 0..1 | NA |
modification date | The most recent date on which the Dataset was changed or modified. | dct:modified | xsd:dateTime | The value indicates a change to the actual dataset, not a change to the catalog record. An absent value may indicate that the resource has never changed after its initial publication, or that the date of last modification is not known, or that the resource is continuously updated. | 0..1 | 2024-06-04T13:36:10.246Z |
number of unique individuals | Number of records for unique individuals. | healthdcatap:numberOfUniqueIndividuals | xsd:nonNegativeInteger | NA | 0..1 | NA |
other identifier | A secondary identifier of the Dataset, such as MAST/ADS, DataCite, DOI, EZID or W3ID. | adms:identifier | adms:Identifier | NA | 0..* | NA |
personal data | Key elements that represent an individual in the dataset. | dpv:hasPersonalData | dpv:PersonalData | https://w3c.github.io/dpv/2.0/pd/ | 0..* | NA |
population coverage | A definition of the population within the dataset. | healthdcatap:populationCoverage | rdfs:Literal | NA | 0..* | NA |
publisher note | A description of the publisher activities. | healthdcatap:publisherNote | rdfs:Literal | NA | 0..1 | NA |
publisher type | A type of organisation that makes the Dataset available. | healthdcatap:publisherType | skos:Concept | A controlled vocabulary is provided, denoting commonly recognised health publishers. | 0..1 | http://purl.org/adms/publishertype/NonGovernmentalOrganisation |
purpose | A free text statement of the purpose of the processing of data or personal data. | dpv:hasPurpose | dpv:Purpose | NA | 0..* | NA |
qualified attribution | An Agent having some form of responsibility for the resource. | prov:qualifiedAttribution | prov:Attribution | NA | 0..* | NA |
qualified relation | A description of a relationship with another resource. | dcat:qualifiedRelation | dcat:Relationship | NA | 0..* | NA |
quality annotation | A statement related to quality of the Dataset, including rating, quality certificate, feedback that can be associated to the dataset. | dqv:hasQualityAnnotation | dqv:QualityCertificate | NA | 0..* | NA |
release date | The date of formal issuance (e.g., publication) of the Dataset. | dct:issued | xsd:dateTime | NA | 0..1 | NA |
retention period | A temporal period during which the dataset is available for secondary use. | healthdcatap:retentionPeriod | dct:PeriodOfTime | NA | 0..* | NA |
sample | A sample distribution of the dataset. | adms:sample | dcat:Distribution | NA | 0..* | NA |
source | A related dataset from which the described dataset is derived. | dct:source | dcat:Dataset | NA | 0..* | NA |
status | The status of a dataset. | adms:status | skos:Concept | A resource from the following authority table must be used: https://publications.europa.eu/resource/authority/dataset-status | 0..* | http://publications.europa.eu/resource/authority/dataset-status/COMPLETED |
temporal coverage | A temporal period that the Dataset covers. | dct:temporal | dct:PeriodOfTime | NA | 0..* | NA |
temporal resolution | The minimum time period resolvable in the dataset. | dcat:temporalResolution | xsd:duration | The minimum time period resolvable in the dataset. | 0..1 | NA |
version | The version indicator (name or identifier) of a resource. | dcat:version | rdfs:Literal | NA | 0..1 | NA |
version notes | A description of the differences between this version and a previous version of the Dataset. | adms:versionNotes | rdfs:Literal | This property can be repeated for parallel language versions of the version notes. | 0..* | NA |
was generated by | An activity that generated, or provides the business context for, the creation of the dataset. | prov:wasGeneratedBy | prov:Activity | NA | 0..* | NA |
was used by | TBA | prov:wasUsedBy | prov:Activity | NA | 0..* | NA |
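Several properties above, such as `dct:modified` and `dct:issued`, have the range `xsd:dateTime`. As a rough sketch of how a consumer might read such a value in Python, the hypothetical `parse_xsd_datetime` helper below parses the example value given for modification date. It relies on `datetime.fromisoformat`, which does not cover every lexical form that `xsd:dateTime` allows, so treat it as an approximation rather than a validator.

```python
from datetime import datetime, timedelta


def parse_xsd_datetime(value):
    """Parse a common xsd:dateTime form into a datetime object."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so the UTC designator is rewritten as an explicit offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


modified = parse_xsd_datetime("2024-06-04T13:36:10.246Z")
print(modified.isoformat())  # 2024-06-04T13:36:10.246000+00:00
print(modified.utcoffset() == timedelta(0))  # True
```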
A collection of datasets that are published separately, but share some characteristics that group them.
Please note: Dataset Series inherits its properties from the Dataset class. This means that when you describe a Dataset Series, you should refer to the properties listed above under the Dataset class.
A collection of operations that provides access to one or more datasets or data processing functions.
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality | Example |
---|---|---|---|---|---|---|
access rights | Information regarding access or restrictions based on privacy, security, or other policies. | dct:accessRights | dct:RightsStatement (IRI) | Information that indicates whether the Data Service is publicly accessible, has access restrictions or is not public. Use one of the following values from this vocabulary (:public, :restricted, :non-public). | 1 | http://publications.europa.eu/resource/authority/access-right/RESTRICTED |
contact point | Contact information that can be used for sending comments about the Data Service. | dcat:contactPoint | vcard:Kind | NA | 1 | mailto:data-access-committee@xumc.nl with name Data Access Committee of the x UMC (see vcard:Kind) |
description | A free-text account of the Data Service. | dct:description | rdfs:Literal | A free-text informative description of the data service. This property can be repeated for providing descriptions in different languages. | 1..* | NA |
end point URL | The root location or primary endpoint of the service (a Web-resolvable IRI). | dcat:endpointURL | IRI | NA | 1 | NA |
identifier | A unique identifier of the resource being described or catalogued. | dct:identifier | rdfs:Literal | NA | 1 | NA |
license | A licence under which the Data service is made available. | dct:license | dct:LicenseDocument | NA | 1 | NA |
publisher | An entity (organisation) responsible for making the Data Service available. | dct:publisher | foaf:Agent | NA | 1 | name: Radboud University Medical Center; identifier: https://ror.org/05wg1m734 (see class foaf:Agent) |
theme | A category of the Data Service. | dcat:theme | skos:Concept | A Data Service may be associated with multiple themes. | 1..* | NA |
title | A name given to the Data Service. | dct:title | rdfs:Literal | NA | 1..* | NA |
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality | Example |
---|---|---|---|---|---|---|
applicable legislation | The legislation that mandates the creation or management of the Data Service. | dcatap:applicableLegislation | eli:LegalResource | TBA | 0..* | NA |
application profile | An established (technical) standard to which the Data Service conforms. | dct:conformsTo | dct:Standard | The standards referred here SHOULD describe the Data Service and not the data it serves. The latter is provided by the dataset with which this Data Service is connected. For instance the data service adheres to the OGC WFS API standard, while the associated dataset adheres to the INSPIRE Address data model. | 0..* | NA |
creator | The entity responsible for producing the resource. | dct:creator | foaf:Agent | NA | 0..* | NA |
end point description | A description of the services available via the end-points, including their operations, parameters etc. | dcat:endpointDescription | rdfs:Literal | The property gives specific details of the actual endpoint instances, while dct:conformsTo is used to indicate the general standard or specification that the endpoints implement. | 0..* | NA |
format | The structure that can be returned by querying the endpointURL. | dct:format | dct:MediaTypeOrExtent | Use the term from the authority table: https://publications.europa.eu/resource/authority/file-type | 0..* | NA |
HVD Category | A data category defined in the High Value Dataset Implementing Regulation. | dcatap:hvdCategory | skos:Concept | For the possible values, consult the regulation at http://data.europa.eu/eli/reg_impl/2023/138/oj or the controlled vocabulary derived from it. | 0..* | NA |
keyword | A keyword or tag describing the Data Service. | dcat:keyword | rdfs:Literal | NA | 0..* | NA |
landing page | A web page that provides access to the Data Service and/or additional information. | dcat:landingPage | foaf:Document | It is intended to point to a landing page at the original data service provider, not to a page on a site of a third party, such as an aggregator. | 0..* | NA |
language | A language of the Data Service. | dct:language | dct:LinguisticSystem | A language from the following authority table: https://publications.europa.eu/resource/authority/language | 0..* | NA |
modification date | Most recent date on which the catalog entry was changed, updated or modified. | dct:modified | xsd:dateTime | NA | 0..1 | NA |
other identifier | Any other identifiers in addition to the identifier. | adms:identifier | adms:Identifier | NA | 0..* | NA |
rights | A statement that specifies rights associated with the Data Service. | dct:rights | dct:RightsStatement | NA | 0..* | NA |
serves dataset | This property refers to a collection of data that this data service can distribute. | dcat:servesDataset | dcat:Dataset | NA | 0..* | NA |
An available distribution of the dataset.
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality | Example |
---|---|---|---|---|---|---|
access URL | A URL that gives access to a Distribution of the Dataset. | dcat:accessURL | IRI | This property contains a URL that gives access to a Distribution of the Dataset. The resource at the access URL may contain information about how to get the Dataset. | 1 | NA |
applicable legislation | The legislation that mandates the creation or management of the Distribution. | dcatap:applicableLegislation | eli:LegalResource | TBA | 1..* | NA |
license | A licence under which the Distribution is made available. | dct:license | dct:LicenseDocument | This should contain a URL that provides details regarding the license that is applicable to this dataset (open data commons, data access policy link etc.) | 1 | NA |
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality | Example |
---|---|---|---|---|---|---|
access service | A data service that gives access to the distribution of the dataset. | dcat:accessService | dcat:DataService | dcat:accessService SHOULD be used to link to a description of a dcat:DataService that can provide access to this distribution. | 0..1 | NA |
byte size | The size of a Distribution in bytes. | dcat:byteSize | xsd:nonNegativeInteger | NA | 0..1 | NA |
checksum | A mechanism that can be used to verify that the contents of a distribution have not changed. | spdx:checksum | spdx:Checksum | The checksum is related to the downloadURL. | 0..1 | NA |
compression format | The format of the file in which the data is contained in a compressed form, e.g. to reduce the size of the downloadable file. | dcat:compressFormat | dct:MediaType | It SHOULD be expressed using a media type as defined in the official register of media types managed by IANA. | 0..1 | NA |
description | A free-text account of the distribution. | dct:description | rdfs:Literal | This property can be repeated for parallel language versions of the description. | 0..* | NA |
documentation | A page or document about this Distribution. | foaf:page | foaf:Document (IRI) | NA | 0..* | NA |
download URL | A URL that is a direct link to a downloadable file in a given format. | dcat:downloadURL | IRI | NA | 0..1 | NA |
format | The file format of the Distribution. | dct:format | dct:MediaTypeOrExtent | Use the term from the authority table: https://publications.europa.eu/resource/authority/file-type | 0..1 | http://publications.europa.eu/resource/authority/file-type/TSV |
language | A language used in the Distribution. | dct:language | dct:LinguisticSystem (IRI) | This property can be repeated if the metadata is provided in multiple languages. Use a term from the authority table: http://publications.europa.eu/resource/authority/language | 0..* | NA |
linked schemas | An established schema to which the described Distribution conforms. | dct:conformsTo | dct:Standard (IRI) | NA | 0..* | NA |
media type | The media type of the distribution as defined by IANA [IANA-MEDIA-TYPES]. | dcat:mediaType | IRI | This property SHOULD be used when the media type of the distribution is defined in IANA [IANA-MEDIA-TYPES]; otherwise dct:format MAY be used with different values. | 0..1 | https://www.iana.org/assignments/media-types/text/csv |
modification date | The most recent date on which the Distribution was changed or modified. | dct:modified | xsd:dateTime | NA | 0..1 | NA |
packaging format | The format of the file in which one or more data files are grouped together, e.g. to enable a set of related files to be downloaded together. | dcat:packageFormat | dct:MediaType | It SHOULD be expressed using a media type as defined in the official register of media types managed by IANA. | 0..1 | NA |
release date | The date of formal issuance (e.g., publication) of the Distribution. | dct:issued | xsd:dateTime | NA | 0..1 | NA |
retention period | A temporal period during which the dataset is available for secondary use. | healthdcatap:retentionPeriod | dct:PeriodOfTime | NA | 0..* | NA |
rights | A statement that specifies rights associated with the Distribution. | dct:rights | dct:RightsStatement | A statement covering all rights that are not addressed by the license property, such as copyright statements. | 0..1 | NA |
status | The status of the distribution in the context of maturity lifecycle. | adms:status | skos:Concept | It MUST take one of the values Completed, Deprecated, Under Development, Withdrawn. Use a term from the authority table: https://publications.europa.eu/resource/authority/distribution-status | 0..1 | NA |
temporal resolution | The minimum time period resolvable in the dataset distribution. | dcat:temporalResolution | xsd:duration | NA | 0..1 | NA |
title | A name given to the Distribution. | dct:title | rdfs:Literal | This property can be repeated for providing names in parallel languages. | 0..* | NA |
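The cardinalities listed in the Distribution tables above can be checked mechanically. The following is a minimal sketch, assuming a Distribution is held as a plain dictionary that maps property URIs to lists of values. The dictionary layout, the helper names `parse_cardinality` and `check_cardinalities`, and the example.org values are illustrative assumptions and not part of the schema.

```python
from typing import Optional

# Mandatory Distribution properties and their cardinalities, copied from the table above.
MANDATORY = {
    "dcat:accessURL": "1",
    "dcatap:applicableLegislation": "1..*",
    "dct:license": "1",
}


def parse_cardinality(text: str) -> tuple[int, Optional[int]]:
    """Turn '1', '0..1' or '1..*' into (minimum, maximum); a maximum of None means unbounded."""
    if ".." not in text:
        exact = int(text)
        return exact, exact
    low, high = text.split("..")
    return int(low), None if high == "*" else int(high)


def check_cardinalities(record: dict[str, list], rules: dict[str, str]) -> list[str]:
    """Return one message per property whose number of values violates its cardinality."""
    problems = []
    for prop, cardinality in rules.items():
        minimum, maximum = parse_cardinality(cardinality)
        count = len(record.get(prop, []))
        if count < minimum or (maximum is not None and count > maximum):
            problems.append(f"{prop}: expected {cardinality}, found {count}")
    return problems


# Hypothetical record in which the license is missing.
distribution = {
    "dcat:accessURL": ["https://example.org/dataset/1/access"],
    "dcatap:applicableLegislation": ["https://example.org/legislation/1"],
}
print(check_cardinalities(distribution, MANDATORY))
```

The same rule dictionary could be extended with the optional properties (for example `"dcat:byteSize": "0..1"`) to catch values that are repeated where at most one is allowed.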
A collective endeavour of some kind. The Project class represents the class of things that are 'projects'. These may be formal or informal, collective or individual. It is often useful to indicate the homepage of a Project.
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality |
---|---|---|---|---|---|
catalogue | TBA | dcat:resource | dcat:Catalog | NA | 1..* |
description | A free-text account of the Project. | dct:description | rdfs:Literal | NA | 1..* |
funder | The agent providing funding for the project. | foaf:fundedBy | foaf:Agent | NA | 1..* |
identifier | A unique identifier of the project. | dct:identifier | rdfs:Literal | NA | 1 |
title | A title of the project. | dct:title | rdfs:Literal | NA | 1..* |
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality |
---|---|---|---|---|---|
study | A study that is performed in the context of the project. | dct:hasPart | Study | NA | 0..* |
A Study represents the process by which a data set was generated or collected.
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality |
---|---|---|---|---|---|
dataset | The dataset that was generated as a result of this study. | prov:generated | dcat:Dataset | NA | 1..* |
description | A free-text description of the study. | dct:description | rdfs:Literal | NA | 1..* |
identifier | A unique identifier of the study. | dct:identifier | rdfs:Literal | NA | 1 |
project | The project of which this study is a part. | dct:isPartOf | foaf:Project | NA | 1 |
title | The title of the study. | dct:title | rdfs:Literal | NA | 1..* |
There are currently no recommended properties for this class.
An entity that is associated with Catalogues and/or Datasets. Agents can be individuals or organisations. If the Agent is an organisation, the use of the Organization Ontology is recommended.
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality |
---|---|---|---|---|---|
identifier | A unique identifier of the agent. | dct:identifier | rdfs:Literal | A unique identifier of a person or organisation being described, like ORCID for a researcher or ROR for an organization. | 1..1 |
name | A name of the agent. | foaf:name | rdfs:Literal | This property contains a name of the agent. This property can be repeated for different versions of the name (e.g. the name in different languages). | 1..* |
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality | Example |
---|---|---|---|---|---|---|
country | Country of the agent. | dct:spatial | dct:Location | Point to the country code URL from Geonames. | 0..* | https://www.geonames.org/2759794/amsterdam.html |
email | An email address via which contact can be made. This property SHOULD be used to provide the email address of the Agent, specified using the fully qualified mailto: URI scheme [RFC6068]. The email SHOULD be used to establish a communication channel to the agent. | foaf:mbox | rdfs:Resource | NA | 0..* | |
type | A type of the agent that makes the Catalogue or Dataset available. | dct:type | skos:Concept | This property should be described using the ADMS vocabulary. | 0..1 | |
URL | A webpage that either allows contact to be made (e.g. a web form) or contains information on how to get in contact. | foaf:homepage | rdfs:Resource | NA | 0..1 | |
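The requirement that email addresses are given as fully qualified mailto: URIs can be handled at ingestion time. A minimal sketch using only the Python standard library follows. The helper name `to_mailto_uri` and the example address are hypothetical, and the address check is deliberately loose: it is not a full RFC 6068 validator.

```python
from urllib.parse import quote


def to_mailto_uri(address: str) -> str:
    """Build a mailto: URI from an email address (loose sanity check only)."""
    address = address.strip()
    # Accept input that already carries the scheme, so the function is idempotent.
    if address.lower().startswith("mailto:"):
        address = address[len("mailto:"):]
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    # Percent-encode anything outside the unreserved set; '+' is kept for sub-addressing.
    return "mailto:" + quote(local, safe="+") + "@" + quote(domain, safe="")


print(to_mailto_uri("data.steward@example.org"))
```

The resulting string can be used as the value of foaf:mbox for an Agent, and equally for vcard:hasEmail on a contact point.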
Contact information of the contact point for Dataset and DatasetSeries.
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality |
---|---|---|---|---|---|
formatted name | The full name of the contact point. | vcard:fn | xsd:string | NA | 1 |
has email | An email address via which contact can be made. | vcard:hasEmail | rdfs:Resource | NA | 1 |
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality |
---|---|---|---|---|---|
contact page | A webpage that either allows contact to be made (e.g. a web form) or contains information on how to get in contact. | vcard:hasURL | rdfs:Resource | NA | 0..* |
A value that allows the contents of a file to be authenticated. This class allows the results of a variety of checksum and cryptographic message digest algorithms to be represented.
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality |
---|---|---|---|---|---|
algorithm | The algorithm used to produce the subject Checksum. | spdx:algorithm | spdx:ChecksumAlgorithm | NA | 1 |
checksum value | A lower case hexadecimal encoded digest value produced using a specific algorithm. | spdx:checksumValue | xsd:hexBinary | NA | 1 |
There are currently no recommended properties for this class.
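As an illustration of the two mandatory Checksum properties, the sketch below hashes a file with SHA-256 and returns a lower-case hexadecimal digest. It assumes SHA-256 is an acceptable algorithm for your catalogue. The function name `sha256_checksum`, the dictionary layout and the sample file are hypothetical, and the algorithm identifier shown is the SPDX 2.x individual, which should be confirmed against the vocabulary version you use.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_checksum(path: Path, chunk_size: int = 1 << 20) -> dict[str, str]:
    """Hash a file in chunks and return values for the two mandatory Checksum properties."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large distributions do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return {
        # Assumption: SPDX 2.x individual for SHA-256; confirm the IRI your catalogue expects.
        "spdx:algorithm": "spdx:checksumAlgorithm_sha256",
        # hexdigest() already returns lower-case hexadecimal, as the definition requires.
        "spdx:checksumValue": digest.hexdigest(),
    }


# Usage with a small, throwaway sample file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.tsv"
    sample.write_bytes(b"id\tvalue\n1\t42\n")
    print(sha256_checksum(sample))
```

Because the checksum relates to the file behind the download URL, it should be recomputed whenever that file changes.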
An interval of time that is named or defined by its start and end dates.
There are currently no mandatory properties for this class.
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality |
---|---|---|---|---|---|
end date | The end of the period. | dcat:endDate | xsd:dateTime | NA | 0..1 |
start date | The start of the period. | dcat:startDate | xsd:dateTime | NA | 0..1 |
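Both properties take xsd:dateTime values. A minimal sketch of producing them from Python `datetime` objects follows, for the case where both the start and the end are known. The helper name `period_of_time` is hypothetical, and requiring timezone-aware values is an assumption made here to keep the output unambiguous; xsd:dateTime itself also permits values without a timezone.

```python
from datetime import datetime, timezone


def period_of_time(start: datetime, end: datetime) -> dict[str, str]:
    """Return dcat:startDate and dcat:endDate as xsd:dateTime lexical values."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("use timezone-aware datetimes so the values are unambiguous")
    if end < start:
        raise ValueError("end date precedes start date")
    return {
        # isoformat() yields e.g. 2020-01-01T00:00:00+00:00, a valid xsd:dateTime form.
        "dcat:startDate": start.isoformat(),
        "dcat:endDate": end.isoformat(),
    }


print(period_of_time(
    datetime(2020, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 12, 31, 23, 59, 59, tzinfo=timezone.utc),
))
```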
A description of a Catalogued Resource's entry in the Catalogue.
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality |
---|---|---|---|---|---|
language | A language used in the textual metadata describing titles, descriptions, etc. of the Dataset. | dct:language | dct:LinguisticSystem | This property can be repeated if the metadata is provided in multiple languages. | 1..* |
modification date | The most recent date on which the Catalogue entry was changed or modified. | dct:modified | xsd:dateTime | NA | 1 |
primary topic | A link to the Dataset, Data Service or Catalog described in the record. | foaf:primaryTopic | dcat:Resource | A catalogue record will refer to one entity in a catalogue. This can be either a Dataset or a Data Service. To ensure an unambiguous reading of the cardinality, the range is set to Catalogued Resource. However, this range is not intended to require the explicit use of the class Catalogued Resource: as it is an abstract class, a subclass should be used. | 1 |
Property name | Definition | URI | rdfs:Range | Usage Note | Cardinality |
---|---|---|---|---|---|
application profile | An Application Profile that the Dataset's metadata conforms to. | dct:conformsTo | dct:Standard | NA | 0..1 |
change type | The status of the catalogue record in the context of editorial flow of the dataset and data service descriptions. | adms:status | skos:Concept | NA | 0..1 |
description | A free-text account of the record. This property can be repeated for parallel language versions of the description. | dct:description | rdfs:Literal | NA | 0..* |
listing date | The date on which the description of the Dataset was included in the Catalogue. | dct:issued | xsd:dateTime | NA | 0..1 |
source metadata | The original metadata that was used in creating metadata for the Dataset. | dct:source | dcat:CatalogRecord | NA | 0..1 |
title | A name given to the Catalogue Record. | dct:title | rdfs:Literal | This property can be repeated for parallel language versions of the name. | 0..* |
All things described by RDF are called resources, and they are instances of the class rdfs:Resource. This is the class of everything, and all other classes are subclasses of it. dcat:Resource is the DCAT class for resources that are published or curated by a single agent.
Within DCAT and DCAT-AP, the term "resource" generally encompasses all objects that can be described using RDF. However, there are specific categories and attributes used to indicate the different types of resources:
dcat:Dataset is a type of dcat:Resource representing a collection of data.
dcat:Distribution is a type of dcat:Resource representing an available form or representation of a dataset.
dcat:Catalog is a type of dcat:Resource representing a collection of datasets.
dcat:DataService, introduced in DCAT version 2, is a type of dcat:Resource representing a service for accessing data.
In DCAT and DCAT-AP, the vocabulary is focused on datasets. Nonetheless, users may need to portray a variety of resources specific to certain domains, like biobanks or patient registries. In such cases, we propose potential scenarios for modifying or augmenting DCAT to accurately depict your resource type:
Use dcat:Resource directly: If the asset you are dealing with is not in line with the dcat:Dataset definition, you can use the broader term dcat:Resource. This term allows you to represent almost any type of asset. However, this approach may not make the nature of the asset clear to users. The asset type can be defined further with specific vocabularies over time.
Extend with custom classes: If there is a need to represent specific resources, such as biobanks or patient registries, it may be beneficial to supplement the core DCAT vocabulary with custom classes. For example:
:Collection a rdfs:Class ;
rdfs:subClassOf dcat:Resource .
and
:PatientRegistry a rdfs:Class ;
rdfs:subClassOf dcat:Dataset .
When creating custom classes, it is essential to provide detailed, unambiguous metadata for each type of resource. This enables users and systems to tell the types apart and to understand their subtle differences, for instance the distinction between a collection and a dataset.
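One way a consuming system could tell custom classes apart is to follow their rdfs:subClassOf links up to the nearest DCAT class. The sketch below illustrates this with a plain dictionary instead of an RDF library. The `ex:` prefix stands in for whatever namespace the custom classes are minted in, the helper names are hypothetical, and the sketch assumes each class has a single direct superclass.

```python
# Direct rdfs:subClassOf links. The two ex: classes mirror the Turtle example above;
# the dcat:Dataset link comes from DCAT itself.
SUBCLASS_OF = {
    "ex:Collection": "dcat:Resource",
    "ex:PatientRegistry": "dcat:Dataset",
    "dcat:Dataset": "dcat:Resource",
}


def ancestors(cls: str) -> list[str]:
    """Follow rdfs:subClassOf links upwards and return the chain of superclasses."""
    chain = []
    seen = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        if cls in seen:  # guard against accidental cycles in the declarations
            break
        seen.add(cls)
        chain.append(cls)
    return chain


def is_dataset(cls: str) -> bool:
    """True when the class is dcat:Dataset or a (transitive) subclass of it."""
    return cls == "dcat:Dataset" or "dcat:Dataset" in ancestors(cls)


for name in ("ex:Collection", "ex:PatientRegistry"):
    print(name, ancestors(name), is_dataset(name))
```

With these declarations a patient registry is treated as a dataset, so dataset-specific properties apply to it, while a collection is only known to be a generic catalogued resource.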