BODS 0.2 and BODS 0.3 specify that any of `id`, `scheme`, `schemeName`, `uri` are required. This is unfortunate, since it's possible for identifiers to have `schemeName` but no `scheme`, leading to lots of matching on text strings, rather than on a persistent identifier.
e.g. LEIs have `scheme` `XI-LEI`, and `schemeName` `Global Legal Entity Identifier Index`. This makes it possible to match on `scheme = 'XI-LEI'`.
e.g. OpenCorporates identifiers have `schemeName` `OpenCorporates`, but no `scheme`. This means it's necessary to always match on the name itself.
e.g. OpenOwnership Register identifiers have `schemeName` `OpenOwnership Register`, but no `scheme`. This means it's necessary to always match on the name itself.
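To make the difference concrete, here is a minimal sketch of the two matching styles. The `scheme` and `schemeName` values are the ones quoted above; the `id` values, the helper names, and the reworded names are placeholders made up for illustration, not anything taken from BODS or Register.

```python
# Hypothetical identifier objects; the "id" values are placeholders.
lei = {
    "scheme": "XI-LEI",
    "schemeName": "Global Legal Entity Identifier Index",
    "id": "EXAMPLE-LEI-0001",
}
opencorporates = {
    "schemeName": "OpenCorporates",  # no "scheme" present
    "id": "EXAMPLE-OC-0001",
}


def matches_scheme(identifier: dict, scheme: str) -> bool:
    """Match on the persistent scheme code; False when "scheme" is absent."""
    return identifier.get("scheme") == scheme


def matches_scheme_name(identifier: dict, scheme_name: str) -> bool:
    """Fallback: exact match on the human-readable name."""
    return identifier.get("schemeName") == scheme_name


# Suppose the display names are reworded (hypothetical rewordings).
reworded_lei = {**lei, "schemeName": "LEI"}
reworded_oc = {**opencorporates, "schemeName": "Open Corporates"}

# Matching on the code survives the rewording...
print(matches_scheme(reworded_lei, "XI-LEI"))
# ...while matching on the name silently stops working.
print(matches_scheme_name(reworded_oc, "OpenCorporates"))
```

With a `scheme` present, the match is unaffected by how the name is worded; without one, every consumer has to agree on the exact name string.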
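However, even for LEIs, `schemeName` is still stored within the statement itself, meaning simply rewording the name `Global Legal Entity Identifier Index` (e.g. to `LEI` or `GLEIF Identifier`, etc.) would cause a cascade of having to republish every single statement, since the hash of the statement and therefore the `statementID` would change (correct behaviour considering that `schemeName` is considered to be within the statement).
This commit shows some of the text matching going on in Register, made necessary by this specification: https://github.com/openownership/register-sources-bods/commit/9df343855d103ac472e9e9c59642580d5ef489fc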
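The republishing cascade can be illustrated with a small sketch. It assumes, purely for illustration, that a statement ID is a SHA-256 hash over a canonical JSON serialisation of the statement; the actual derivation used by any given publisher may differ, and the statement fields and `id` value below are hypothetical.

```python
import hashlib
import json


def statement_id(statement: dict) -> str:
    """Assumed derivation: SHA-256 over canonical JSON (sorted keys, compact)."""
    canonical = json.dumps(statement, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical, heavily simplified statement; the "id" value is a placeholder.
original = {
    "statementType": "entityStatement",
    "identifiers": [
        {
            "scheme": "XI-LEI",
            "schemeName": "Global Legal Entity Identifier Index",
            "id": "EXAMPLE-LEI-0001",
        }
    ],
}

# Same statement, with only the display name reworded.
reworded = {
    "statementType": "entityStatement",
    "identifiers": [
        {
            "scheme": "XI-LEI",
            "schemeName": "LEI",
            "id": "EXAMPLE-LEI-0001",
        }
    ],
}

# The entity and its identifier are unchanged, but the hash is not.
print(statement_id(original) == statement_id(reworded))
```

Under this assumption, any statement carrying the old name gets a new ID after the rewording, even though nothing about the entity itself has changed.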
I propose one of two changes to BODS:
1. Make `scheme` required, leaving `schemeName` optional.
2. Remove `schemeName` entirely, making `scheme` required.

From my current understanding, I would prefer 2, since it would avoid a statement republishing storm.
In addition, I propose to make either `id` or `uri` required (as a single choice fixed in the specification, rather than the 'any of' language that currently exists).
I don't currently have an opinion on which of `id` or `uri` would be better.
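As a rough sketch of what the proposed rules would mean for a validator: the function below is hypothetical and not part of any BODS tooling. Since the choice between `id` and `uri`, and between keeping or removing `schemeName`, is left open above, both are parameters here.

```python
def validate_identifier(
    identifier: dict,
    value_field: str = "id",
    allow_scheme_name: bool = True,
) -> list:
    """Return a list of error messages; an empty list means the identifier passes.

    value_field: whichever of "id" or "uri" the specification would make required.
    allow_scheme_name: True for "schemeName optional", False for "schemeName removed".
    """
    if value_field not in ("id", "uri"):
        raise ValueError("value_field must be 'id' or 'uri'")

    errors = []
    if not identifier.get("scheme"):
        errors.append("scheme is required")
    if not identifier.get(value_field):
        errors.append(f"{value_field} is required")
    if not allow_scheme_name and "schemeName" in identifier:
        errors.append("schemeName is not allowed")
    return errors


# Placeholder "id" values, for illustration only.
with_scheme = {"scheme": "XI-LEI", "id": "EXAMPLE-LEI-0001"}
name_only = {"schemeName": "OpenCorporates", "id": "EXAMPLE-OC-0001"}

print(validate_identifier(with_scheme))
print(validate_identifier(name_only))
print(validate_identifier(name_only, allow_scheme_name=False))
```

An identifier carrying only a `schemeName` is accepted under the current 'any of' wording, but would be rejected under either proposal.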
NOTE: This ticket is almost certainly filed in the wrong place—but I'm not sure where the right place is, given multiple repositories and different ticket groupings splitting BODS specification, proposals, implementations, and such. Please feel free to move it to the correct place. Thank you!
Thanks @tiredpixel. I've noted these suggestions and stored them alongside others - see discussion here for example - relating to required fields in BODS which we'll revisit at a later stage (not in version 0.4).