openownership / data-standard

The Beneficial Ownership Data Standard (BODS) is an open standard providing a specification for modelling and publishing information on the beneficial ownership and control of corporate vehicles
http://standard.openownership.org

Missing information: publisher codelists #247

Open ScatteredInk opened 5 years ago

ScatteredInk commented 5 years ago

We have a reason field, but we might also want to let publishers use their own short codelists, as with the Companies House enumerations.

stevenday commented 5 years ago

On the related JIRA issue for the register's BODS export, you mentioned a missingInfoCode field. However, I was thinking this might be better as a nested object with a 'reason' and 'description' (or better names), similar to how we structure unspecified relationships. That might help a little to standardise these two connected but separate sections of the standard.

ScatteredInk commented 5 years ago

Yes - we moved missing info reasons to a nested structure in 0.2:

https://github.com/openownership/data-standard/blob/3237fd3feee6e63c52b46a9acaf698ae75f41d54/schema/ownership-or-control-statement.json#L108-L139

There is a similar nested object when the exemption or missing data is at entity level.

So I think the question is whether we want a structure like:

- reason: a required field drawn from the BODS closed codelist
- originalReason: an optional open codelist drawn from the source system that maps to the closed BODS codelist
- description: an optional human-readable description, either of the codes in the open codelist or an inferred description of why the data is missing

originalReason gives analysts a quick way to search based on the original data. description is readable and self-documenting, but it is also subject to change (evidence: the Companies House repo) and harder to analyse. We might still want to keep it so that we can understand some kinks in the data later on.
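
To make the proposed three-field structure concrete, here is a minimal Python sketch of how a publisher might build such an object. Everything in it is an assumption for illustration: the codelist values, the publisher codes, the mapping and the function name `build_missing_info` are hypothetical placeholders, not the published BODS codelist or the Companies House enumerations.

```python
from typing import Optional

# Hypothetical closed codelist. Placeholder values only, not the real BODS codelist.
BODS_REASONS = {
    "no-beneficial-owners",
    "subject-exempt-from-disclosure",
    "unknown",
}

# Hypothetical mapping from a publisher's own (open) codes to the closed codelist.
PUBLISHER_TO_BODS = {
    "no-psc": "no-beneficial-owners",
    "psc-exempt": "subject-exempt-from-disclosure",
}


def build_missing_info(original_reason: str, description: Optional[str] = None) -> dict:
    """Build a nested missing-information object from a publisher code."""
    # Unmapped publisher codes fall back to "unknown" rather than being dropped.
    reason = PUBLISHER_TO_BODS.get(original_reason, "unknown")
    if reason not in BODS_REASONS:
        raise ValueError(f"{reason!r} is not in the closed codelist")

    # reason is required; originalReason keeps the source system's code searchable.
    missing_info = {"reason": reason, "originalReason": original_reason}
    # description is optional, so it is only included when supplied.
    if description is not None:
        missing_info["description"] = description
    return missing_info


example = build_missing_info("psc-exempt", "Entity is exempt under the source register's rules")
```

An analyst could then filter on `originalReason` for exact source codes, or on `reason` to compare across publishers, and treat `description` as free text that may change over time.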