Open StephenAbbott opened 1 year ago
I don't see any references to missingInfoReason on the current register master branch (7df086eff6e072a3294b97b2f9dc330c7905a4a3). In addition, the only references to missingInfoReason I can find across repositories are in register-sources-bods, where it is used for the PepStatusDetails type. Although missingInfoReason was renamed to unspecifiedEntityDetails and unspecifiedPersonDetails in BODS 0.2, the rename does not appear to affect PEP status, so the current reference to missingInfoReason would seem to be correct.
Additionally, unspecifiedEntityDetails and unspecifiedPersonDetails appear in register-sources-bods, register-transformer-psc, register-transformer-sk and register-transformer-dk. register is using a version of register-sources-bods which has this code.
@StephenAbbott, is this issue still current, or has it already been fixed?
Hmmm. This is one where we maybe need a little more checking/conversation before I can say whether it has been fixed or not.
There is no sign of the unspecifiedEntityDetails and unspecifiedPersonDetails fields among the BODS fields appearing on Datasette for the full Open Ownership Register dataset: https://bods-data-datasette.openownership.org/register
If the issue had been fixed and changes implemented in the BODS mappers, then I would expect that we'd be seeing the data coming through.
Might it be the case that we've made changes but still need to do a full import to generate this data, @spacesnottabs?
I think this is missing the correct implementation in the PSC transformer:
Note that no test cases could be found in the imported data. To verify this, the original PSC bulk data from 2023-11-01 was checked, but no entity or person with unspecified details could be identified.
Additionally, the PSC bulk data link (https://download.companieshouse.gov.uk/en_pscdata.html) is currently not working, resulting in timeouts. This has been confirmed across multiple IPs and people.
Further work on this ticket is paused pending identification of test cases, i.e. entities and people for whom these unspecified details can be found in the raw data. Then, work will be needed to ensure those details get imported into the Elasticsearch raw indexes, and subsequently to implement the missing parts of the transformation into BODS as noted above. Until these test cases are identified, however, nothing more can be done.
On hold pending @StephenAbbott being able to find examples
An issue regarding 'unknownPerson' in the register has been identified based on an external enquiry. The case involves the BODS JSON for Novo Nordisk, where there is an “Unknown Person(s)” personStatement with id 5845298339200587547-unknown, listed as a UBO of Novo Nordisk. However, in the downloadable register, the 06-07 version does not include this entity.
Additionally, there are no other types of persons listed across all three registers in the OO register except for 'knownPerson' in the downloadable bulk data:
Kadie suspects that this issue is connected to the ongoing investigation and kindly suggested adding it here, as it might be the kind of example needed to test the fix.
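One way to look for usable examples in a bulk download is to tally person statements by personType and list any statements that carry unspecified details. The following is only a minimal sketch: it assumes the bulk data is BODS 0.2 in JSON Lines form (one statement per line), and the helper names person_type_counts and statements_with_unspecified_details are hypothetical, not part of any register repository.

```ruby
require 'json'

# Hypothetical helper: tally person statements by personType.
# `lines` is any enumerable of JSON strings, e.g. File.foreach(path).
def person_type_counts(lines)
  counts = Hash.new(0)
  lines.each do |line|
    line = line.strip
    next if line.empty?

    statement = JSON.parse(line)
    next unless statement['statementType'] == 'personStatement'

    counts[statement.fetch('personType', '(missing)')] += 1
  end
  counts
end

# Hypothetical helper: collect the statementID of every statement that
# has unspecifiedPersonDetails or unspecifiedEntityDetails set.
def statements_with_unspecified_details(lines)
  ids = []
  lines.each do |line|
    line = line.strip
    next if line.empty?

    statement = JSON.parse(line)
    if statement.key?('unspecifiedPersonDetails') || statement.key?('unspecifiedEntityDetails')
      ids << statement['statementID']
    end
  end
  ids
end
```

If the tally contains only knownPerson and the second helper returns an empty list, that would be consistent with the unknown-person statements never reaching the bulk export.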
In version 0.2 of BODS, the missinginforeason field from version 0.1 was renamed. The Register v2 code is still using missinginforeason rather than unspecifiedEntityDetails for entities or unspecifiedPersonDetails for people: https://github.com/search?q=repo%3Aopenownership%2Fregister-v2%20missinginforeason&type=code
I suspect this is why the unknown_person.unknown_reason is not working in the Register v2 prototype.
Hopefully making this change will allow Register v2 to generate the person statements for unknown persons, and to assign the relevant text to explain why the person is unknown: https://github.com/openownership/register/blob/bb03cbc850c478b95276c8c21fc8ebd524b8d4e4/app/service_objects/bods_mapper.rb#L189 https://github.com/openownership/register/blob/bb03cbc850c478b95276c8c21fc8ebd524b8d4e4/spec/shared_contexts/bods.rb#L482
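To illustrate the rename, here is a rough sketch of moving a top-level missingInfoReason string into the BODS 0.2 fields. It is not taken from bods_mapper.rb. The helper name migrate_missing_info_reason is hypothetical, and the sketch assumes the 0.2 fields are objects with reason and description keys and that 'unknown' is an acceptable reason code, both of which should be checked against the BODS 0.2 schema.

```ruby
# Hypothetical helper: return a copy of a statement hash with a top-level
# missingInfoReason moved into the BODS 0.2 field for its statement type.
# Nested uses (e.g. inside PEP status details) are deliberately left alone.
def migrate_missing_info_reason(statement, reason: 'unknown')
  return statement unless statement.key?('missingInfoReason')

  target_key =
    if statement['statementType'] == 'entityStatement'
      'unspecifiedEntityDetails'
    else
      'unspecifiedPersonDetails'
    end

  # Build a new hash so the caller's original statement is not mutated.
  migrated = statement.reject { |key, _| key == 'missingInfoReason' }
  migrated[target_key] = {
    'reason' => reason, # assumed codelist value, not confirmed here
    'description' => statement['missingInfoReason']
  }
  migrated
end
```

Keeping the old free-text value as the description preserves the explanation shown to users, while the reason code can be refined later once real examples are available.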