SEMICeu / DCAT-AP

This is the issue tracker for the maintenance of DCAT-AP
https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe

HVD: Dataset applicable legislation #308

Closed idevisser closed 6 months ago

idevisser commented 9 months ago

Does it make sense to record information about the legislation that mandates the creation or management of the Data Service at dataset level?

bertvannuffelen commented 9 months ago

To make the HVD reporting requirements unambiguous, applicableLegislation is indeed necessary on every element that a publisher wants to be taken into account for the HVD reporting.

We start from the assumption that one has an existing catalogue of datasets and services, and that only some of these are in scope of the HVD IR. Explicit annotation is necessary to mark scope at a fine-grained level without the need for (complex) inference agreements, which can lead to misinterpretation and therefore to non-compliance with the HVD IR.

This is also a consequence of the fact that in the DCAT ecosystem, as an RDF knowledge graph, there is no "existence" dependency between entities. Stating that a data service provides access to a dataset does not make the existence of one dependent on the other. That is an interpretation. In contrast to XML, RDF does not have a technical boundary. In XML ecosystems, the existence of a connected entity is often bound by its embedding in the XML tree: anything outside the XML tree is not connected. That "scoping" approach is not part of RDF, and it is very hard to enforce through the structures of an RDF knowledge graph. Even though RDF based on DCAT might result in a directed graph, it is a graph in which each node has its own lifecycle.

As a vocabulary, DCAT does not enforce a particular scoping, nor a fixed "dependency" interpretation. DCAT-AP follows this. DCAT-AP HVD also shows that this "dependency" interpretation might differ from application context to application context. It is dangerous for DCAT-AP profiles to impose inferred assumptions, as data created in another application context is likely not to follow them.

All this is in line with the HVD IR legislation: it explicitly states minimum requirements. E.g. there should be an API providing data in INSPIRE format (for the geospatial domain). That does not exclude the existence of another API providing data in an alternative way.
Instead of creating duplicates of the metadata, DCAT-AP HVD suggests strengthening the existing metadata approaches in such a way that both APIs are documented properly, but only the INSPIRE API is annotated as part of the HVD IR.
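To illustrate the two-API case, here is a minimal sketch in plain Python that models the catalogue as a set of (subject, predicate, object) triples, so no RDF library is needed. The `example.org` resource URIs and the helper names are hypothetical. The property URI (`dcatap:applicableLegislation` under `http://data.europa.eu/r5r/`) and the ELI URI for the HVD IR are written as I understand them from DCAT-AP HVD and should be checked against the published specification.

```python
# Predicates and the legislation URI (assumed values, verify against DCAT-AP HVD).
APPLICABLE_LEGISLATION = "http://data.europa.eu/r5r/applicableLegislation"
SERVES_DATASET = "http://www.w3.org/ns/dcat#servesDataset"
HVD_IR = "http://data.europa.eu/eli/reg_impl/2023/138/oj"

# Hypothetical catalogue resources.
DATASET = "https://example.org/dataset/addresses"
INSPIRE_API = "https://example.org/service/addresses-inspire"
OTHER_API = "https://example.org/service/addresses-json"

GRAPH = {
    # The dataset itself is reported under the HVD IR.
    (DATASET, APPLICABLE_LEGISLATION, HVD_IR),
    # Both APIs are documented as serving the dataset ...
    (INSPIRE_API, SERVES_DATASET, DATASET),
    (OTHER_API, SERVES_DATASET, DATASET),
    # ... but only the INSPIRE API is explicitly annotated as HVD.
    (INSPIRE_API, APPLICABLE_LEGISLATION, HVD_IR),
}


def in_hvd_scope(graph, resource):
    """True only if the resource carries the explicit HVD annotation."""
    return (resource, APPLICABLE_LEGISLATION, HVD_IR) in graph


def hvd_services(graph, dataset):
    """Services serving the dataset that are themselves annotated as HVD."""
    return sorted(
        s for (s, p, o) in graph
        if p == SERVES_DATASET and o == dataset and in_hvd_scope(graph, s)
    )
```

With this data, `hvd_services(GRAPH, DATASET)` returns only the INSPIRE API. The alternative API stays documented and linked to the dataset, but because scope is read from the explicit annotation alone, it is not counted for HVD reporting.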

Of course, nothing prevents one from programming a propagation of the HVD IR scoping to all connected services and distributions of a dataset. But it is not for DCAT-AP HVD to make this kind of interpretation, as it is a legal interpretation that is not explicit in the HVD IR.
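Such a propagation could look roughly like the sketch below, again using plain Python triples. It is an application-level choice, not something DCAT-AP HVD prescribes. The function name and the `example.org` URIs are hypothetical, and the property and legislation URIs are assumptions to be checked against the specification.

```python
LEGISLATION = "http://data.europa.eu/r5r/applicableLegislation"  # assumed URI
HVD_IR = "http://data.europa.eu/eli/reg_impl/2023/138/oj"        # assumed URI
SERVES_DATASET = "http://www.w3.org/ns/dcat#servesDataset"
DISTRIBUTION = "http://www.w3.org/ns/dcat#distribution"


def propagate_hvd_scope(triples):
    """Return a new triple set in which the HVD annotation of a dataset is
    copied to its directly connected services and distributions.

    This encodes one possible local interpretation; the input is not modified.
    """
    in_scope = {s for (s, p, o) in triples if p == LEGISLATION and o == HVD_IR}
    inferred = set()
    for s, p, o in triples:
        if p == SERVES_DATASET and o in in_scope:
            # A service that serves an in-scope dataset.
            inferred.add((s, LEGISLATION, HVD_IR))
        elif p == DISTRIBUTION and s in in_scope:
            # A distribution of an in-scope dataset.
            inferred.add((o, LEGISLATION, HVD_IR))
    return set(triples) | inferred


# Hypothetical catalogue: only the dataset is annotated by the publisher.
ds = "https://example.org/dataset/addresses"
api_a = "https://example.org/service/addresses-inspire"
api_b = "https://example.org/service/addresses-json"
dist = "https://example.org/distribution/addresses-gml"

catalogue = {
    (ds, LEGISLATION, HVD_IR),
    (api_a, SERVES_DATASET, ds),
    (api_b, SERVES_DATASET, ds),
    (ds, DISTRIBUTION, dist),
}

expanded = propagate_hvd_scope(catalogue)
```

After propagation, both services and the distribution carry the annotation, including the alternative API that the publisher may not have intended to report. That is exactly the kind of inferred assumption that a consumer can opt into locally but that the profile itself does not impose.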

I hope this is an answer to your question.

idevisser commented 8 months ago

Thanks for the explanation, but the issue can be simple: the definition of applicableLegislation in paragraph 7.6 is "The legislation that mandates the creation or management of the Data Service." Paragraph 7.6 describes the properties of the dataset, so this definition does not seem to fit at this level. The definition could be something like "The legislation that mandates the creation, publication or management of the Dataset." That would solve this issue. (I added publication because there is no requirement to create data in the context of HVD.)