openownership / data-standard

The Beneficial Ownership Data Standard (BODS) is an open standard providing a specification for modelling and publishing information on the beneficial ownership and control of corporate vehicles
http://standard.openownership.org

Carry out an audit of required fields #470

Open lgs85 opened 1 year ago

lgs85 commented 1 year ago

If we carry out the work upgrading JSON Schema detailed in #469, we may also wish to consider a full audit of required fields in the schema.

lgs85 commented 1 year ago

The first task is to document the required fields in a list here; then we can make any easy/quick decisions about adding or removing required fields. Developing a comprehensive approach to this is a much bigger piece of work, and possibly out of scope for BODS v0.4.

rhiaro commented 1 year ago

Required fields:

Identifier / scheme OR schemeName

Annotation / statementPointerTarget
Annotation / motivation
Annotation / url (if `motivation` is `linking`)

Country / name

Jurisdiction / name

PublicationDetails / publicationDate
PublicationDetails / bodsVersion
PublicationDetails / publisher

Publisher / name OR url

PublicListing / securitiesListings
PublicListing / hasPublicListing

SecuritiesListing / stockExchangeJurisdiction
SecuritiesListing / security
SecuritiesListing / stockExchangeName

SecuritiesListing / security / ticker

EntityStatement / statementID
EntityStatement / statementType
EntityStatement / isComponent
EntityStatement / entityType
EntityStatement / publicationDetails
EntityStatement / unspecifiedEntityDetails / reason
EntityStatement / entitySubtype / generalCategory

OwnershipOrControlStatement / statementID
OwnershipOrControlStatement / statementType
OwnershipOrControlStatement / isComponent
OwnershipOrControlStatement / subject
OwnershipOrControlStatement / interestedParty
OwnershipOrControlStatement / publicationDetails
OwnershipOrControlStatement / subject / describedByEntityStatement

InterestedParty / unspecified / reason

PersonStatement / statementID
PersonStatement / statementType
PersonStatement / personType
PersonStatement / isComponent
PersonStatement / publicationDetails
PersonStatement / unspecifiedPersonDetails / reason
PersonStatement / politicalExposure / status
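
A list like this can be checked mechanically. The following is a minimal sketch in Python, assuming a small illustrative subset of the fields above; the `REQUIRED` table and the `missing_required` helper are hypothetical names, not part of BODS or any BODS tooling. Each requirement is a list of alternative field names, so "name OR url" is a single entry with two alternatives, and the conditional `url` rule for annotations is handled separately.

```python
# Illustrative subset of the required fields listed above (not the full schema).
# Each inner list holds alternatives: at least one of them must be present.
REQUIRED = {
    "Identifier": [["scheme", "schemeName"]],  # scheme OR schemeName
    "Country": [["name"]],
    "Publisher": [["name", "url"]],  # name OR url
    "PublicationDetails": [["publicationDate"], ["bodsVersion"], ["publisher"]],
    "Annotation": [["statementPointerTarget"], ["motivation"]],
}


def missing_required(object_type, obj):
    """Return the unmet requirements for obj as a list of alternative-name lists.

    Presence is tested by key only; empty values are not inspected here.
    """
    missing = [
        alternatives
        for alternatives in REQUIRED.get(object_type, [])
        if not any(name in obj for name in alternatives)
    ]
    # Conditional rule from the list: url is required when motivation is "linking".
    if object_type == "Annotation" and obj.get("motivation") == "linking":
        if "url" not in obj:
            missing.append(["url"])
    return missing
```

For example, `missing_required("Publisher", {})` reports the single unmet requirement `["name", "url"]`, while a publisher with only a `url` passes.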

tiredpixel commented 11 months ago

BODS 0.2 and BODS 0.3 do not specify any required fields for Identifier. By my understanding of this, even "identifiers": [{}, {}, {}] would technically be valid.

More importantly, neither id nor uri is a required field. This means things like this are technically valid: "identifiers": [{"scheme": "USA-TAXID"}]. But these are not useful identifiers.

Importing these into Register results in incorrectly matched entities, since they are detected as having the same identifiers. A workaround has been put in place to drop all identifiers containing neither id nor uri. However, this is an area where Register is having to depart from the spec in order to make things work.
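
The workaround described above amounts to a simple filter. Here is a minimal sketch in Python; it is not Register's actual implementation, and the function name `drop_unusable_identifiers` is hypothetical. It assumes identifiers are plain dictionaries and treats an empty `id` or `uri` the same as a missing one.

```python
def drop_unusable_identifiers(identifiers):
    """Keep only identifiers that carry a non-empty id or uri.

    Identifiers such as {} or {"scheme": "USA-TAXID"} are dropped, because
    matching entities on them would treat unrelated entities as the same.
    """
    return [
        identifier
        for identifier in identifiers
        if identifier.get("id") or identifier.get("uri")
    ]
```

Applied to `[{}, {"scheme": "USA-TAXID"}]`, the filter returns an empty list, so neither entry can cause a false match.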

I propose that at least one of id or uri should be required.

I would actually go further and suggest that the spec itself choose which of id or uri is required, rather than accepting either. e.g. It would seem reasonable to require id for every identifier. But such a stronger proposal is not necessary in order to solve the problem.
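
The two proposals can be written with the standard JSON Schema keywords `required` and `anyOf`. The sketch below, in Python, holds both variants as dictionaries and evaluates them with a tiny hand-written checker that understands only those two keywords; the names `AT_LEAST_ONE`, `ID_ALWAYS` and `satisfies` are hypothetical, and this is not the BODS schema itself.

```python
# Weaker proposal: at least one of id or uri must be present.
AT_LEAST_ONE = {
    "type": "object",
    "anyOf": [{"required": ["id"]}, {"required": ["uri"]}],
}

# Stronger proposal: id is required for every identifier.
ID_ALWAYS = {
    "type": "object",
    "required": ["id"],
}


def satisfies(schema, obj):
    """Evaluate only the `required` and `anyOf` keywords of a schema fragment."""
    if any(key not in obj for key in schema.get("required", [])):
        return False
    branches = schema.get("anyOf")
    if branches is not None and not any(satisfies(b, obj) for b in branches):
        return False
    return True
```

Under either variant, `{"scheme": "USA-TAXID"}` is rejected; an identifier with only a `uri` passes the weaker variant but not the stronger one.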

This would not be solved by the suggestion above of either scheme or schemeName being required in BODS 0.4.

References https://github.com/openownership/register/issues/171 .

tiredpixel commented 11 months ago

On a related topic, the suggestion above is that scheme or schemeName become required fields. I support that proposal, but I would suggest it is made even stronger: that scheme become a required field for all identifiers.

Further to that, I would actually suggest that schemeName is dropped entirely, since it leads to a lot of extra space taken up in identifiers, in a brittle way. e.g. If scheme were set to XI-LEI and schemeName to Global Legal Entity Identifier Index (as is currently done in Register), it would not be possible to change schemeName to GLEIF or similar without having to republish every single statement.
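
To illustrate the brittleness point, here is a minimal sketch in Python of resolving a human-readable name from `scheme` at read time instead of storing `schemeName` in every identifier. The `SCHEME_NAMES` table and `display_name` function are hypothetical; the single entry is taken from the example above.

```python
# Hypothetical lookup table maintained alongside the data, not inside it.
SCHEME_NAMES = {
    "XI-LEI": "Global Legal Entity Identifier Index",
}


def display_name(identifier, names=SCHEME_NAMES):
    """Return a readable name for the identifier's scheme.

    Falls back to the raw scheme code when no name is registered.
    """
    scheme = identifier["scheme"]
    return names.get(scheme, scheme)
```

Renaming the scheme to GLEIF then means changing one table entry; the published statements, which carry only `scheme`, stay as they are.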

I suggest it would be far more useful to maintain a list of recommended scheme values, or even to add these to the specification. Or potentially, scheme could be set by the specification itself, with the proviso that X- prefix could be used for publishing statements with custom identifiers or those which haven't been given an official scheme by BODS, yet. This would be similar to the approach taken with specified and custom HTTP headers.

Further to this idea, we could agree to adopt all identifiers listed on Org ID, for example. This would mean that scheme would be an ID given by that project, or alternatively a custom ID prefixed with X- or similar (presuming that such a prefix isn't used by Org ID). We are already doing something very similar in Register (e.g. XI-LEI, GB-COH)—but this would make it part of the spec and so more useful, and result in much better data consistency cross-datasource (and also, reduce the size of statements whilst not losing important information).

https://org-id.guide/
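
The X- prefix convention could be checked along the following lines. This is a minimal sketch in Python, assuming a set of recognised scheme codes is available; `RECOGNISED_SCHEMES` here contains only the two codes mentioned above and stands in for whatever list the specification or Org ID would supply. The names `CUSTOM_PREFIX` and `classify_scheme` are hypothetical.

```python
# Stand-in for a list of recognised scheme codes (assumption: two examples only).
RECOGNISED_SCHEMES = {"XI-LEI", "GB-COH"}

# Assumed prefix for custom schemes, as suggested in the comment above.
CUSTOM_PREFIX = "X-"


def classify_scheme(scheme):
    """Classify a scheme code as 'custom', 'recognised' or 'unknown'."""
    if scheme.startswith(CUSTOM_PREFIX):
        return "custom"
    if scheme in RECOGNISED_SCHEMES:
        return "recognised"
    return "unknown"
```

Note that XI-LEI does not start with X-, so it is classified as recognised rather than custom; this relies on the presumption above that no recognised code uses the X- prefix.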

kathryn-ods commented 10 months ago

@tiredpixel I have made a separate ticket about scheme and schemeName so that doesn't get lost in this ticket - #542

kd-ods commented 9 months ago

I've added a 'required fields' tag to this issue and any other related issues.

kathryn-ods commented 2 months ago

#728 relates to this.