openownership / data-standard

The Beneficial Ownership Data Standard (BODS) is an open standard providing a specification for modelling and publishing information on the beneficial ownership and control of corporate vehicles
http://standard.openownership.org

Feature: Republishing, provenance and transformation #464

Open lgs85 opened 1 year ago

lgs85 commented 1 year ago

[This ticket helps track progress towards developing a particular feature in BODS where changes or revisions to the standard may be required. It should be placed on the BODS Feature Tracker, under the relevant status column.

_See Feature development in BODS in the Handbook._

The title of this GitHub ticket should be 'Feature: XXXXX' where XXXXX is the feature name below. The information in this first post on the thread should be updated as necessary so that it holds up-to-date information. Comments on this ticket can be used to help track high-level work towards this feature or to refine this set of information.]

Feature name: Republishing, provenance and transformation

Feature background

Briefly describe the purpose of this feature

This feature ticket proposes that BODS should provide scope for representing republished data derived from multiple sources, and should enable transparency about the provenance of republished or mapped BODS data, alongside any transformation steps taken.

BODS is tailored almost exclusively towards primary beneficial ownership data, consisting of newly published or updated statements by declaring entities to state bodies. This state-level beneficial ownership data is stored and sometimes published in a register, and usually consists solely of declarations by entities registered in the state which maintains the register.

A number of organizations are already republishing beneficial ownership data from multiple sources. Most notably, the Open Ownership Register combines beneficial ownership data from the United Kingdom, Denmark, Slovakia and Ukraine, reconciles and deduplicates this with data from OpenCorporates, and republishes the reconciled data in BODS format.

The Open Ownership Register is currently being upgraded to publish in BODS v0.2, but neither v0.2 nor v0.3 enables information from multiple sources to be represented, because the section of BODS that deals with source information is designed around the idea of a single source.

What user needs are met by introducing or developing this feature in BODS?

A key reason for developing any data standard is to enable interoperability. With beneficial ownership data, being able to easily represent reconciled and republished statements from multiple sources meets a large and distinct set of user needs. User stories include:

What impact would not meeting these needs have?

How urgent is it to meet the above needs?

At present there are relatively few public registers of beneficial ownership data. As a result, beneficial ownership data can be enhanced and republished in BODS format using slight adaptations to the standard (e.g. using the source.description field). This slightly reduces the urgency of modifying BODS to explicitly accommodate republished data, but the need will become more urgent as more countries start publishing public beneficial ownership registers.
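As a rough illustration of the kind of adaptation mentioned above, the sketch below folds several upstream sources into the free-text source.description field of a single source object. Only source.description is named in this ticket; the `type` and `assertedBy` fields and the `thirdParty` code are assumed from the BODS v0.2 source object and should be checked against the schema. The register names, URLs and the `republished_source` helper are hypothetical.

```python
import json


def republished_source(upstreams, republisher_name):
    """Build one BODS-style source object summarising several upstream sources.

    Hypothetical helper: it flattens all provenance into free text, which is
    exactly the limitation this feature ticket describes.
    """
    description = "Republished from: " + "; ".join(
        f"{u['name']} ({u['url']})" for u in upstreams
    )
    return {
        # Assumed field names and code value; verify against the BODS schema.
        "type": ["thirdParty"],
        "description": description,
        "assertedBy": [{"name": republisher_name}],
    }


# Placeholder upstream registers, not real publishers.
upstreams = [
    {"name": "Example Register A", "url": "https://example.org/register-a"},
    {"name": "Example Register B", "url": "https://example.org/register-b"},
]

source = republished_source(upstreams, "Example Republisher")
print(json.dumps(source, indent=2))
```

Because the upstream sources end up inside a single string, a consumer cannot reliably tell which source supports which part of a statement, or which transformation steps were applied. That is the gap a structured approach to provenance would need to close.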

Are there any obvious problems, dependencies or challenges that any proposal to develop this feature would need to address?

Feature work tracking

StephenAbbott commented 7 months ago

See https://github.com/openownership/register/issues/223 for republishing issue raised by work on Open Ownership Register

StephenAbbott commented 2 months ago

Check ISO 8000-120:2016 which "specifies requirements for the representation and exchange of information about the provenance of master data that consists of characteristic data, and supplements the requirements of ISO 8000‑110"

StephenAbbott commented 1 month ago

Noting issue https://github.com/openownership/data-standard/issues/638, raised by @kathryn-ods, which asks whether 'cleaning' and 'enrichment' should be added to the motivations codelist. This may be relevant to future development of this feature.