Open jakubklimek opened 1 year ago
Perhaps some inspiration can be taken from GeoDCAT-AP, where both dcat:theme and dcterms:subject are used. See the controlled vocabulary section for an overview.
There is some discussion in B.6.8 of why different expressions have been mapped to either dcat:theme or dcterms:subject. However, I fail to see a clear argument for which property should be used when, beyond the need to keep the use of each vocabulary separate (which will fail if we also have dcat:theme on Data Services, which I think we should have).
Also in BRegDCAT-AP, both controlled vocabularies (Data theme and Eurovoc) are available for Dataset:
I propose to take this out of the release of DCAT-AP 3.
This discussion is also about the interpretation of expressing additional constraints on properties.
E.g., is there a difference in what the MUST is expected to mean between the statement that the property bankAccountNR must have an IBAN structure (which could be expressed as a literal with a specific regular expression) and the statement that the property theme must adhere to NAL:datatheme?
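To make the comparison concrete, here is a minimal sketch that treats both constraints as plain value checks. Everything in it is an assumption for illustration: the IBAN pattern is a simplified structural check (no checksum validation), and the set of theme concepts is a one-element stand-in for the real NAL data-theme list.

```python
import re

# Assumption: simplified IBAN shape (country code, two check digits,
# 11 to 30 alphanumerics). It does not verify the checksum.
IBAN_PATTERN = re.compile(r"[A-Z]{2}[0-9]{2}[A-Z0-9]{11,30}")

# Assumption: a one-element stand-in for the data-theme vocabulary.
DATA_THEME_CONCEPTS = frozenset({
    "http://publications.europa.eu/resource/authority/data-theme/AGRI",
})


def is_iban_shaped(value):
    """Constraint on a literal: the value must match a pattern."""
    return IBAN_PATTERN.fullmatch(value) is not None


def is_data_theme(value):
    """Constraint on a resource: the value must be a known concept."""
    return value in DATA_THEME_CONCEPTS


print(is_iban_shaped("GB82WEST12345698765432"))
print(is_data_theme("http://publications.europa.eu/resource/authority/data-theme/AGRI"))
```

Seen this way, both are range restrictions on a property value; one is checked by pattern, the other by membership.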
SKOS has skos:ConceptScheme to define whether a concept is part of a controlled vocabulary.
For example, in the EU Publications Office (PO) data-theme controlled vocabulary, each concept specifies the scheme it is part of:
<rdf:Description rdf:about="http://publications.europa.eu/resource/authority/data-theme/AGRI">
  <skos:inScheme rdf:resource="http://publications.europa.eu/resource/authority/data-theme"/>
  <skos:topConceptOf rdf:resource="http://publications.europa.eu/resource/authority/data-theme"/>
</rdf:Description>
I don't understand why the vocabulary restriction is put on dcat:theme; as I understand it, the vocabulary should be identified using skos:inScheme.
Validators could then consider only those concepts that are in a specific scheme (controlled vocabulary), as specified with skos:inScheme.
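A minimal sketch of what such a validator step could look like, using only the Python standard library. The function names are hypothetical, the quoted fragment is wrapped in an rdf:RDF root with namespace declarations so that it parses on its own, and the second theme value is a made-up URI from some other vocabulary.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
DATA_THEME = "http://publications.europa.eu/resource/authority/data-theme"

# The fragment quoted above, wrapped so that it is a complete XML document.
VOCABULARY = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description rdf:about="http://publications.europa.eu/resource/authority/data-theme/AGRI">
    <skos:inScheme rdf:resource="http://publications.europa.eu/resource/authority/data-theme"/>
    <skos:topConceptOf rdf:resource="http://publications.europa.eu/resource/authority/data-theme"/>
  </rdf:Description>
</rdf:RDF>"""


def concepts_in_scheme(rdf_xml, scheme):
    """Return the URIs of concepts that declare skos:inScheme <scheme>."""
    root = ET.fromstring(rdf_xml)
    found = set()
    for description in root.findall(f"{{{RDF}}}Description"):
        about = description.get(f"{{{RDF}}}about")
        for in_scheme in description.findall(f"{{{SKOS}}}inScheme"):
            if in_scheme.get(f"{{{RDF}}}resource") == scheme:
                found.add(about)
    return found


def themes_from_scheme(theme_values, rdf_xml, scheme):
    """Keep only the theme values that belong to the given concept scheme."""
    known = concepts_in_scheme(rdf_xml, scheme)
    return [value for value in theme_values if value in known]


# Hypothetical dcat:theme values of one dataset.
dataset_themes = [
    "http://publications.europa.eu/resource/authority/data-theme/AGRI",
    "http://example.org/other-vocabulary/concept-1",  # made-up URI
]
print(themes_from_scheme(dataset_themes, VOCABULARY, DATA_THEME))
```

With this approach the validator ignores values from other schemes instead of rejecting them, so dcat:theme itself would not need a vocabulary restriction.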
I also agree with @jakubklimek (https://github.com/SEMICeu/DCAT-AP/issues/314#issuecomment-1765711454): if subproperties are required for dcat:theme, then dcat:theme itself should not have a vocabulary restriction, and a :semicTheme subproperty should also be used to enforce a vocabulary.
Based on https://github.com/SEMICeu/DCAT-AP/issues/314, https://github.com/SEMICeu/DCAT-AP/issues/207, the description of dcat:theme, and the chapter on Other controlled vocabularies, I think there is a need to clarify the usage of additional dataset themes, including examples. It is clear that the Data theme vocabulary needs to be used for dcat:theme. What is unclear from the current state of DCAT-AP 3.0.0, and where the discussions are not yet concluded, is how additional themes should be used. Let's say I want to use Eurovoc in addition to the Dataset theme vocabulary. What do I do?
The usage note for dcat:theme says: "The values to be used for this property are the URIs of the concepts in the vocabulary." It is unclear whether this means ONLY values from the vocabulary, or AT LEAST ONE value from this vocabulary.
Option 1 (implemented in the Czech National Open Data Catalog): Use dcat:theme also for Eurovoc, e.g.:
This seems to be discouraged by @bertvannuffelen in https://github.com/SEMICeu/DCAT-AP/issues/207#issuecomment-1700613026, where creating subproperties of dcat:theme and enforcing the ONLY values from the dataset theme vocabulary policy is suggested instead. However, as I mentioned in https://github.com/SEMICeu/DCAT-AP/issues/314#issuecomment-1765711454, I do not think that these two approaches go together: from the RDF point of view, the values of a subproperty can also be interpreted as values of the superproperty, i.e. dcat:theme, violating the constraint.
Option 2: Other vocabularies use dct:subject:
This is another approach suggested by @bertvannuffelen in https://github.com/SEMICeu/DCAT-AP/issues/314#issuecomment-1764792636, which does not create any problems. However, it is not mentioned anywhere in DCAT-AP.
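The subproperty concern raised under Option 1 can be illustrated with a small sketch. It models triples as Python tuples and applies the RDFS subproperty entailment rule (rdfs7 in RDF 1.1 Semantics) by hand. The names ex:eurovocTheme and ex:dataset and the Eurovoc-like URI are hypothetical, and membership in the data-theme vocabulary is approximated by a URI prefix check purely for brevity.

```python
DCAT_THEME = "dcat:theme"
SUBPROPERTY_OF = "rdfs:subPropertyOf"
DATA_THEME_PREFIX = "http://publications.europa.eu/resource/authority/data-theme/"


def apply_subproperty_rule(triples):
    """rdfs7: (p rdfs:subPropertyOf q) and (s p o) entail (s q o)."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triple is produced
        changed = False
        sub_props = {(s, o) for (s, p, o) in inferred if p == SUBPROPERTY_OF}
        for (s, p, o) in list(inferred):
            for (sub, sup) in sub_props:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


def theme_values(triples, dataset):
    """All dcat:theme values of one dataset."""
    return {o for (s, p, o) in triples if s == dataset and p == DCAT_THEME}


def only_from_data_theme(values):
    """The ONLY reading: every value must come from the data-theme vocabulary."""
    return all(v.startswith(DATA_THEME_PREFIX) for v in values)


def at_least_one_from_data_theme(values):
    """The AT LEAST ONE reading: one data-theme value is enough."""
    return any(v.startswith(DATA_THEME_PREFIX) for v in values)


asserted = {
    ("ex:eurovocTheme", SUBPROPERTY_OF, DCAT_THEME),  # hypothetical subproperty
    ("ex:dataset", DCAT_THEME, DATA_THEME_PREFIX + "AGRI"),
    ("ex:dataset", "ex:eurovocTheme", "http://example.org/eurovoc/concept-1"),
}
entailed = apply_subproperty_rule(asserted)

before = theme_values(asserted, "ex:dataset")
after = theme_values(entailed, "ex:dataset")
print(only_from_data_theme(before))         # True
print(only_from_data_theme(after))          # False
print(at_least_one_from_data_theme(after))  # True
```

Under these assumptions, a validator that applies RDFS inference sees the subproperty value as a dcat:theme value, so the ONLY reading fails while the AT LEAST ONE reading still holds. Values stated with dct:subject (Option 2) are not affected, because dct:subject is not declared a subproperty of dcat:theme here.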
I think this shows the need for a decision and clearer guidance on how additional dataset themes should be used in DCAT-AP.