Open lgs85 opened 1 year ago
The first task is to document the required fields in a list here; then we can make any easy/quick decisions about adding or removing required fields. Developing a comprehensive approach to this is a much bigger piece of work, and possibly out of scope for BODS v0.4.
Required fields:
Identifier / scheme OR schemeName
Annotation / statementPointerTarget
Annotation / motivation
Annotation / url (if `motivation` is `linking`)
Country / name
Jurisdiction / name
PublicationDetails / publicationDate
PublicationDetails / bodsVersion
PublicationDetails / publisher
Publisher / name OR url
PublicListing / securitiesListings
PublicListing / hasPublicListing
SecuritiesListing / stockExchangeJurisdiction
SecuritiesListing / security
SecuritiesListing / stockExchangeName
SecuritiesListing / security / ticker
EntityStatement / statementID
EntityStatement / statementType
EntityStatement / isComponent
EntityStatement / entityType
EntityStatement / publicationDetails
EntityStatement / unspecifiedEntityDetails / reason
EntityStatement / entitySubtype / generalCategory
OwnershipOrControlStatement / statementID
OwnershipOrControlStatement / statementType
OwnershipOrControlStatement / isComponent
OwnershipOrControlStatement / subject
OwnershipOrControlStatement / interestedParty
OwnershipOrControlStatement / publicationDetails
OwnershipOrControlStatement / subject / describedByEntityStatement
InterestedParty / unspecified / reason
PersonStatement / statementID
PersonStatement / statementType
PersonStatement / personType
PersonStatement / isComponent
PersonStatement / publicationDetails
PersonStatement / unspecifiedPersonDetails / reason
PersonStatement / politicalExposure / status
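As a rough illustration of how the top-level entries in this list could be checked, here is a minimal Python sketch. The names `REQUIRED_TOP_LEVEL` and `missing_required` are hypothetical, the field lists are copied only from the statement-level entries above, and nested or conditional requirements (such as `Annotation / url`) are deliberately left out.

```python
# Top-level required fields per statement type, copied from the list above.
# Nested and conditional requirements are not covered by this sketch.
REQUIRED_TOP_LEVEL = {
    "EntityStatement": [
        "statementID", "statementType", "isComponent",
        "entityType", "publicationDetails",
    ],
    "OwnershipOrControlStatement": [
        "statementID", "statementType", "isComponent",
        "subject", "interestedParty", "publicationDetails",
    ],
    "PersonStatement": [
        "statementID", "statementType", "personType",
        "isComponent", "publicationDetails",
    ],
}


def missing_required(kind, statement):
    """Return the required top-level fields absent from `statement`."""
    return [field for field in REQUIRED_TOP_LEVEL[kind] if field not in statement]


# Hypothetical, incomplete person statement used only for illustration.
example = {
    "statementID": "abc",
    "statementType": "personStatement",
    "isComponent": False,
}
missing = missing_required("PersonStatement", example)
```

Here `missing` would list `personType` and `publicationDetails`, the two top-level fields the example leaves out.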
BODS 0.2 and BODS 0.3 do not specify any required fields for `Identifier`. By my understanding of this, even `"identifiers": [{}, {}, {}]` would technically be valid.
More importantly, neither `id` nor `uri` is a required field. This means things like this are technically valid: `"identifiers": [{"scheme": "USA-TAXID"}]`. But these are not useful identifiers.
Importing these into Register results in incorrectly matched entities, since they are detected as having the same identifiers. A workaround has been put in place to drop all identifiers containing neither `id` nor `uri`. However, this is an area where Register is having to depart from the spec in order to make things work.
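To make the failure mode concrete, here is a small Python sketch of the mismatch and of a filter in the spirit of the workaround. This is not Register's actual code: the function name `drop_unusable_identifiers`, the equality-based matching, and the sample `GB-COH` value are all assumptions for illustration.

```python
def drop_unusable_identifiers(identifiers):
    """Keep only identifiers that carry a non-empty `id` or `uri`."""
    return [
        ident for ident in identifiers
        if ident.get("id") or ident.get("uri")
    ]


# Two unrelated entities whose only common identifier is a bare scheme.
entity_a = [{"scheme": "USA-TAXID"}]
entity_b = [{"scheme": "USA-TAXID"}, {"scheme": "GB-COH", "id": "01234567"}]

# A naive comparison sees the bare identifiers as equal, so the entities "match".
naive_match = any(a == b for a in entity_a for b in entity_b)

# After filtering, entity_a has no identifiers left, so nothing can match.
clean_a = drop_unusable_identifiers(entity_a)
clean_b = drop_unusable_identifiers(entity_b)
filtered_match = any(a == b for a in clean_a for b in clean_b)
```

With this data, `naive_match` is true and `filtered_match` is false, which is the behaviour the workaround is after.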
I propose that at least one of `id` or `uri` should be required.
I would actually suggest going further: the spec should choose which of `id` or `uri` (or both) is required, rather than accepting either. For example, it would seem reasonable to require `id` for every identifier. But such a stronger proposal is not necessary in order to solve the problem.
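The two variants of the proposal can be sketched as a single check. This is only an illustration in Python, not schema text: the function `identifier_errors` and its `require_id` flag are hypothetical names, and empty strings are treated as missing, which is an assumption beyond what the proposal states.

```python
def identifier_errors(identifier, require_id=False):
    """Check one identifier against the proposed rules.

    require_id=False: weaker proposal, at least one of `id` or `uri`.
    require_id=True:  stronger proposal, `id` is always required.
    Empty strings count as missing (an assumption of this sketch).
    """
    has_id = bool(identifier.get("id"))
    has_uri = bool(identifier.get("uri"))
    errors = []
    if require_id:
        if not has_id:
            errors.append("missing id")
    elif not (has_id or has_uri):
        errors.append("missing id or uri")
    return errors


# The identifiers that are valid today but not useful would be rejected.
empty_result = identifier_errors({})
scheme_only_result = identifier_errors({"scheme": "USA-TAXID"})

# A uri-only identifier passes the weaker rule but fails the stronger one.
uri_only = {"scheme": "XI-LEI", "uri": "https://example.org/lei/1"}
weak_result = identifier_errors(uri_only)
strong_result = identifier_errors(uri_only, require_id=True)
```

Either variant rejects `{}` and `{"scheme": "USA-TAXID"}`; they differ only on identifiers that have a `uri` but no `id`.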
This would not be solved by the suggestion above of either `scheme` or `schemeName` being required in BODS 0.4.
References https://github.com/openownership/register/issues/171 .
On a related topic, the suggestion above is that `scheme` or `schemeName` become required fields. I support that proposal, but I would suggest it is made even stronger: that `scheme` become a required field for all identifiers.
Further to that, I would actually suggest that `schemeName` is dropped entirely, since it takes up a lot of extra space in identifiers, in a brittle way. For example, if `scheme` were set to `XI-LEI` and `schemeName` to `Global Legal Entity Identifier Index` (as is currently done in Register), it would not be possible to change `schemeName` to `GLEIF` or similar without having to republish every single statement.
I suggest it would be far more useful to maintain a list of recommended `scheme` values, or even to add these to the specification. Or potentially, `scheme` could be set by the specification itself, with the proviso that an `X-` prefix could be used for publishing statements with custom identifiers or those which haven't yet been given an official scheme by BODS. This would be similar to the approach taken with specified and custom HTTP headers.
Further to this idea, we could agree to adopt all identifiers listed on Org ID, for example. This would mean that `scheme` would be an ID given by that project, or alternatively a custom ID prefixed with `X-` or similar (presuming that such a prefix isn't used by Org ID). We are already doing something very similar in Register (e.g. `XI-LEI`, `GB-COH`), but this would make it part of the spec and so more useful, and result in much better data consistency across data sources (and also reduce the size of statements whilst not losing important information).
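A sketch of how a consumer might classify `scheme` values under this idea, in Python. Everything here is illustrative: `RECOMMENDED_SCHEMES` contains only the two codes mentioned in this thread rather than any real published list, and `classify_scheme` and the sample value `X-MYREG` are hypothetical.

```python
# Illustrative only: just the two codes named in this thread,
# not an actual list maintained by BODS or Org ID.
RECOMMENDED_SCHEMES = {"XI-LEI", "GB-COH"}
CUSTOM_PREFIX = "X-"


def classify_scheme(scheme):
    """Return 'recommended', 'custom', or 'unknown' for a scheme value."""
    if not isinstance(scheme, str):
        return "unknown"
    if scheme in RECOMMENDED_SCHEMES:
        return "recommended"
    # A custom scheme needs the prefix plus at least one further character.
    if scheme.startswith(CUSTOM_PREFIX) and len(scheme) > len(CUSTOM_PREFIX):
        return "custom"
    return "unknown"


lei = classify_scheme("XI-LEI")          # listed code
custom = classify_scheme("X-MYREG")      # publisher-defined, prefixed
unlisted = classify_scheme("USA-TAXID")  # neither listed nor prefixed
```

Note that `XI-LEI` does not collide with the `X-` prefix, since its second character is `I` rather than a hyphen; whether any Org ID code does begin with `X-` would still need checking, as noted above.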
@tiredpixel I have made a separate ticket about `scheme` and `schemeName` so that it doesn't get lost in this ticket - #542
I've added a 'required fields' tag to this issue and any other related issues.
If we carry out the work upgrading JSON Schema detailed in #469, we may also wish to consider a full audit of required fields in the schema.