SEMICeu / DCAT-AP

This is the issue tracker for the maintenance of DCAT-AP
https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe

HVD: publisher property Data Service #305

Closed idevisser closed 6 months ago

idevisser commented 9 months ago

The publisher property is missing for Data Service. The publisher of the data service can be a different organisation than the publisher of the data itself.

bertvannuffelen commented 9 months ago

This is not explicitly present in DCAT-AP 2.1.0 for Data Service, and that has been carried over into DCAT-AP 3.0.0.

However, DCAT-AP takes the approach explained in the last paragraph of the overview.

Since the property is not explicitly listed, the DCAT 3.0 guidelines apply. There, publisher is a property of a Catalogued Resource, and thus it can be used to denote the publisher of a Data Service.
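As an illustration only, here is a minimal Python sketch of a Data Service that carries its own publisher, distinct from the publisher of the dataset it serves. The property IRIs (dct:publisher, dcat:servesDataset) come from DCAT and DCMI Terms. All example.org IRIs and the helper names are hypothetical, and the triples are kept as plain tuples so that no RDF library is assumed.

```python
# Minimal sketch: a dcat:DataService with its own dct:publisher.
# Every https://example.org/... IRI below is a hypothetical placeholder.

DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

dataset = "https://example.org/dataset/1"
service = "https://example.org/service/1"
data_owner = "https://example.org/org/data-owner"
service_operator = "https://example.org/org/service-operator"

# (subject, predicate, object) triples; all terms are IRIs.
triples = [
    (dataset, RDF_TYPE, DCAT + "Dataset"),
    (dataset, DCT + "publisher", data_owner),
    (service, RDF_TYPE, DCAT + "DataService"),
    (service, DCAT + "servesDataset", dataset),
    # The service is published by another organisation than the data.
    (service, DCT + "publisher", service_operator),
]


def publishers_of(resource, triples):
    """Return the dct:publisher values stated for a resource."""
    return [o for s, p, o in triples if s == resource and p == DCT + "publisher"]


def to_ntriples(triples):
    """Serialise IRI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


print(to_ntriples(triples))
```

Calling `publishers_of(service, triples)` and `publishers_of(dataset, triples)` returns two different organisations, which is the situation described in the opening comment.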

Is your request also to impose additional constraints on the publisher of a dataset?

idevisser commented 8 months ago

I would make the publisher property recommended for Data Services, as is done for Datasets.

Another point, to be consistent in the properties needed for Datasets and Data Services: make contactPoint a mandatory Dataset property, since the APIs can be made available as distributions. The contact information is required in that case.

bertvannuffelen commented 8 months ago

@idevisser, can you explain the sentence below?

Another point, to be consistent in the properties needed for Datasets and Data Services: make contactPoint a mandatory Dataset property, since the APIs can be made available as distributions. The contact information is required in that case.

Note A) In DCAT-AP HVD we intend to capture only the requirements that HVD imposes. So if HVD does not require contact points to be present everywhere, then I would not deviate from that. You can add this in your local metadata requirements.

Note B) In DCAT-AP we explicitly rely on the available DCAT term Data Service for APIs. We are not going back to a W3C DCAT 1 interpretation where Distributions are a representation of everything. This distinction was one of the main reasons for the W3C DCAT 2 revision. So I would refrain from going back in time.

idevisser commented 8 months ago

It is a matter of background and of how certain things are experienced. From an INSPIRE perspective, we go back in time when we include separate metadata for each service, as required by INSPIRE. Much of the service metadata is already available in the capabilities document or the OpenAPI description. Separate service metadata causes inconsistencies and ambiguity. For INSPIRE data there is now the option to meet the INSPIRE requirements without separate service metadata conforming to ISO 19119 in XML. Access to the service is provided via the distributions in the dataset metadata. From this perspective it is necessary to require a contactPoint in the dataset or distribution properties.

bertvannuffelen commented 8 months ago

I think your feedback contains several aspects:

A) You hint at a fundamental discussion on the usability of DataService descriptions. In DCAT the service notion was motivated by the geospatial community, and it was adopted by W3C DCAT 5 years ago. Now you tell me that the INSPIRE community has decided that the distinction between services and file-based distributions is not meaningful. I would like to hear the reasons.

But as the DCAT-AP community we cannot ignore DCAT and the base model it presents. That model contains Catalogues, Datasets, Distributions, DataServices and DatasetSeries, and DCAT-AP aims to make the best use of them.

B) As I stated in several responses: DCAT-AP HVD is how the HVD requirements are represented in DCAT(-AP).

The HVD explicitly requires API endpoints to be shared for HVD datasets. In DCAT-AP this API endpoint is represented by a DataService. If the current INSPIRE requirements allow distributions only, that is fine for INSPIRE, but INSPIRE-HVD must provide a way to indicate the API endpoint for the HVD dataset. Thus the HVD IR requires an adjustment to the INSPIRE guidelines to make that feasible.

So the fact that DCAT-AP HVD explicitly uses DataServices for APIs is exactly the requirement that is imposed on INSPIRE metadata: how does INSPIRE metadata indicate the HVD API endpoint that is to be provided for a dataset (preferably only one per dataset)? How would a user know, out of the list of distributions in the INSPIRE metadata, which one is the API endpoint for HVD? Although I do not know the reporting method to the EC for HVD, we could for the sake of discussion assume there is a table to be created: | Dataset ID | API |. This reporting forces you to identify in the INSPIRE metadata the API for the dataset. Similarly, the HVD asks for the bulk download: not 100s of distributions, but only the bulk download.
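To make the thought experiment concrete, here is a minimal Python sketch that derives such a | Dataset ID | API | table from Data Service descriptions. The field names `endpoint_url` and `serves_dataset` are modelled on dcat:endpointURL and dcat:servesDataset. The record layout, the identifiers, the example.org URLs and the "MISSING" marker are all assumptions made for illustration and say nothing about the actual reporting format.

```python
# Minimal sketch: derive a "Dataset ID | API" table from Data Service records.
# Record layout, IDs and URLs are hypothetical; only the two field names
# mirror DCAT properties (dcat:endpointURL, dcat:servesDataset).

datasets = ["ds-1", "ds-2", "ds-3"]

services = [
    {"endpoint_url": "https://example.org/api/a", "serves_dataset": ["ds-1"]},
    {"endpoint_url": "https://example.org/api/b", "serves_dataset": ["ds-2"]},
    {"endpoint_url": "https://example.org/api/c", "serves_dataset": ["ds-2"]},
]


def hvd_api_table(datasets, services):
    """Pair every dataset ID with the endpoints of the services serving it."""
    rows = []
    for dataset_id in datasets:
        endpoints = [
            svc["endpoint_url"]
            for svc in services
            if dataset_id in svc["serves_dataset"]
        ]
        rows.append((dataset_id, endpoints))
    return rows


def render(rows):
    """Render the rows as a pipe-separated table; flag datasets without an API."""
    lines = ["| Dataset ID | API |"]
    for dataset_id, endpoints in rows:
        lines.append(f"| {dataset_id} | {', '.join(endpoints) or 'MISSING'} |")
    return "\n".join(lines)


rows = hvd_api_table(datasets, services)
print(render(rows))
```

With explicit Data Services the table falls out of a simple lookup, and datasets with no API or with more than one show up immediately. With distributions only, some extra convention would be needed to tell which distribution is the API endpoint.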

I hope I have clarified that the requirement for Data Service in DCAT-AP HVD is based not only on conformance with W3C DCAT, but also on a requirement in the HVD IR. It is thus a question that has to be resolved within the INSPIRE-HVD guidelines.