Problem statement
Currently, the Core Business Vocabulary (even together with the other Core Vocabularies) does not support all data items that the HVD implementing act requires to be published, specifically those in the annex, part 5 - Companies and company ownership.
This means that each country will publish this data in its own way, making the results non-interoperable. The Core Business Vocabulary, however, seems like the perfect place to define the technical details supporting interoperable publication of such datasets.
As a publisher, I would like to see a specification telling me how exactly (technically) to publish data about "Companies and company ownership" in terms of:
1. data structure, i.e. RDF classes and predicates
2. code lists used
3. data formats (RDF Turtle, JSON-LD, CSV + CSV on the Web, ...)
4. APIs (SPARQL endpoint, #LD, LDES, IRI dereference)
5. metadata description, so that I can find the compliant data in data portals such as data.europa.eu
As a consumer, I would like to see the same so that I can find the data and work with data from individual countries as if they were one dataset, with no additional integration and transformation effort.
Items 3 and 4 are probably out of scope of the Core Business Vocabulary, and item 5 is in scope of DCAT-AP, but items 1 and 2 can be done here.
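To illustrate why shared classes, predicates and code lists matter, here is a minimal sketch in Python (standard library only). It assumes the `legal:` namespace and the terms `LegalEntity`, `legalName` and `companyType` from the Core Business Vocabulary; please verify them against the current specification. The company IRIs and the company type code list under `example.org` are purely hypothetical. Two publishers emit N-Triples in the same shape, and a consumer can then filter the merged data with no per-country transformation.

```python
# Assumed namespace and terms of the Core Business Vocabulary (verify before use).
LEGAL = "http://www.w3.org/ns/legal#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical shared code list for company types (not a real EU Vocabularies IRI).
COMPANY_TYPE_CODES = "https://example.org/codelist/company-type/"


def escape_literal(value: str) -> str:
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def company_to_ntriples(iri: str, legal_name: str, type_code: str) -> list[str]:
    """Publisher side: describe one company with the shared class, predicates and code list."""
    subject = f"<{iri}>"
    return [
        f"{subject} <{RDF_TYPE}> <{LEGAL}LegalEntity> .",
        f'{subject} <{LEGAL}legalName> "{escape_literal(legal_name)}" .',
        f"{subject} <{LEGAL}companyType> <{COMPANY_TYPE_CODES}{type_code}> .",
    ]


def companies_of_type(lines: list[str], type_code: str) -> list[str]:
    """Consumer side: find companies of a given type, regardless of who published them."""
    wanted = f"<{LEGAL}companyType> <{COMPANY_TYPE_CODES}{type_code}> ."
    return sorted(
        line.split(" ", 1)[0].strip("<>") for line in lines if line.endswith(wanted)
    )


# Two hypothetical national publishers using the same specification.
czech = company_to_ntriples(
    "https://example.org/cz/company/1", "Příklad s.r.o.", "limited-liability"
)
belgian = company_to_ntriples(
    "https://example.org/be/company/7", 'Voorbeeld "BV"', "limited-liability"
)

# Because both datasets share structure and codes, merging is plain concatenation.
merged = czech + belgian
print(companies_of_type(merged, "limited-liability"))
```

If each country instead used its own predicates or type codes, `companies_of_type` would need a mapping per publisher, which is exactly the integration effort this use case wants to avoid.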
Existing approaches
As part of the STIRData project, an attempt has been made to extend the Core Vocabularies with the additional items and code lists, specifically in the STIRData data model. The project aims at exactly what needs to be done "officially" within the Core Vocabularies and, for shared code lists, within EU Vocabularies: creating a technical specification which, when adhered to by publishers, ensures technical and semantic interoperability of company-related datasets.
Use case name
Companies and company ownership HVDs
Please insert the status of the use case
Under Development
Use case creator
Jakub Klímek
Stakeholders
Publishers and consumers of Company-related data published as HVDs (High-value datasets)
Links
Requirements
Related use cases
No response
Comments
The need to focus on technical and semantic interoperability of HVDs at the dataset level, not only at the metadata level, was formulated in a position paper of the Czech Republic, which was officially sent to CNECT Unit G.1 Data Policy and Innovation at the beginning of 2020.
In addition, I personally spoke about this need on the SEMIC 2019 Linked Data Showcase panel.