SEMICeu / DCAT-AP

This is the issue tracker for the maintenance of DCAT-AP
https://joinup.ec.europa.eu/solution/dcat-application-profile-data-portals-europe

Missing controlled vocabularies for newly added properties #95

Closed jakubklimek closed 4 years ago

jakubklimek commented 4 years ago

The newly added properties are missing specifications of controlled vocabularies in Section 5 Controlled Vocabularies in DCAT-AP 2.0.0.

bertvannuffelen commented 4 years ago

proposed resolution: to be added to the specification

init-dcat-ap-de commented 4 years ago

dcatap:availability (thanks for adding) should have URIs too


sandervd commented 4 years ago

Would it make sense to define controlled vocabulary consisting of a subset of the mediaType vocabulary, with the appropriate concepts?

jakubklimek commented 4 years ago

> Would it make sense to define controlled vocabulary consisting of a subset of the mediaType vocabulary, with the appropriate concepts?

Why a subset? If it were a subset of the types that we deem appropriate as compression/package formats, then someone would have to manage the subset whenever there are changes or additions. I think it is safer to go either with IANA MediaTypes (which are, however, missing e.g. tar - can this be added there?) or with the existing file format controlled vocabulary.

sandervd commented 4 years ago

Hi @jakubklimek,

> why a subset? If it was a subset of types that we deem appropriate as compression/package formats, then someone would have to manage the subset in case there are changes/additions.

Indeed, the list would need to be managed. My concern is the point you made in #100: 'Lots of freedom means low interoperability.' Specifically, in this case it would be left to each application to decide which archive formats to support.

> I think it is safer to go either with IANA MediaTypes (which are, however, missing e.g. tar - can this be added there?) or the existing file format controlled vocabulary.

I agree that it carries less risk on the AP side, but wouldn't it shift the burden to the implementations?

costezki commented 4 years ago

Concerning MIME types (file types)

One proposal would be to adopt a constraint that validates whether the range value is a valid MIME type. This constraint can be applied to the three properties concerned: dcat:mediaType, dcat:compressFormat and dcat:packageFormat. The FileType controlled vocabulary/authority table published by the Publications Office (OP) on EU Vocabularies is a reasonable way to go about it (although the completeness of this list is questionable, which I address below).

If such a constraint is considered too broad or permissive, further partitions of the FileType controlled vocabulary may be envisaged (e.g. the group of compression formats, text formats, modelling formats, etc.) and applied accordingly. The model currently adopted for expressing OP controlled vocabularies supports the provision of (sub-)classifications and partitions.

The FileType table published by the OP currently contains a shorter list than the one made available by IANA (although the most frequently used types are already covered). The current work on the DCAT-AP specification is nonetheless an excellent occasion for the OP to update the FileType table and extend its coverage, so we would consider enriching this table anyway for the next publication in December.

bertvannuffelen commented 4 years ago

resolution: since in 1.1 the PO filetype was replaced with the IANA codelist for dcat:mediaType, it is logical to apply the same constraint to the properties that have the same range.
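
To illustrate the kind of check this resolution implies, here is a minimal sketch in Python. It is not part of the DCAT-AP specification. It assumes that IANA media types are referenced by URIs under www.iana.org/assignments/media-types/, that triples are given as plain (subject, predicate, object) string tuples, and that the list of top-level types below is a sufficient snapshot. The names CONSTRAINED_PROPERTIES, is_iana_media_type_uri and find_violations are hypothetical. The check is purely syntactic and does not confirm that a type is actually registered with IANA.

```python
import re

# Assumption: IANA media types are identified by URIs of this shape.
IANA_URI = re.compile(
    r"^https?://www\.iana\.org/assignments/media-types/"
    r"([a-z]+)/([A-Za-z0-9][A-Za-z0-9!#$&^_.+-]*)$"
)

# Assumption: a snapshot of common top-level types; may be incomplete.
TOP_LEVEL_TYPES = {
    "application", "audio", "font", "image", "message",
    "model", "multipart", "text", "video",
}

# The three properties named in the discussion.
CONSTRAINED_PROPERTIES = {
    "dcat:mediaType",
    "dcat:compressFormat",
    "dcat:packageFormat",
}


def is_iana_media_type_uri(value):
    """Return True if value looks like an IANA media type URI (syntax only)."""
    match = IANA_URI.match(value)
    return bool(match) and match.group(1) in TOP_LEVEL_TYPES


def find_violations(triples):
    """Return the triples whose constrained property has a non-IANA value."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if p in CONSTRAINED_PROPERTIES and not is_iana_media_type_uri(o)
    ]


if __name__ == "__main__":
    sample = [
        ("ex:dist1", "dcat:mediaType",
         "https://www.iana.org/assignments/media-types/text/csv"),
        ("ex:dist1", "dcat:compressFormat",
         "http://publications.europa.eu/resource/authority/file-type/ZIP"),
    ]
    for triple in find_violations(sample):
        print("not an IANA media type:", triple)
```

In a real deployment the same rule would more likely be expressed as a SHACL shape or a lookup against the published IANA registry; the sketch only shows that one rule can cover all three properties.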