openownership / lib-cove-bods

Check that your data complies with the Beneficial Ownership Data Standard (BODS) using our install our data review library to analyse files via your command line interface
https://datareview.openownership.org/
Other
1 stars 0 forks source link

Rework PEP property stats for 0.3 data #83

Closed kd-ods closed 4 months ago

kd-ods commented 2 years ago

In the Person Statement, hasPepStatus and pepDetails have been wrapped in a PoliticalExposure object and renamed status and details.

If there are existing lib-cove-bods tests relating to hasPepStatus and pepDetails, they may need to be updated.

odscjames commented 2 years ago

There are no checks but there is a statistic.

This is calculated on missingInfoReason field; that is still there so I guess we want this stat to keep working on 0.2 or 0.3 - it just has to check different field names

kd-ods commented 2 years ago

Thanks, @odscjames - sounds like both stats will need to be reworked for 0.3 data. I'll retitled this issue accordingly.

odscjames commented 2 years ago

Just noticed 0.1 has different field names again and we were just checking 0.2 field names so it won't currently work for 0.1 data!

Do we care about making it work for 0.1 data? I guess in the past we didn't.

In Cove, should we be hiding the stats for 0.1 data so as not to mislead? We currently don't show the stat's if there is 0, so we kinda are doing that.

odscjames commented 2 years ago

And in V0.3, if a person has politicalExposure/status==isNotPep we shouldn't be looking at politicalExposure/details at all?

(Should that be another check - can only have politicalExposure/details if isPep, or if isPep should have some details)

kd-ods commented 2 years ago

Do we care about making it work for 0.1 data?

I think that's a safe 'no'. Since no-one publishes to that afaik.

In Cove, should we be hiding the stats for 0.1 data so as not to mislead?

Yes. Again - seems safe.

Should that be another check...

Yes - I'll spin up another ticket, mirroring #81

kd-ods commented 2 years ago

Followed above comment by creating #87

odscjames commented 2 years ago

I've just fully realised the fact that the status can have an ‘unknown’ value.

1)

This makes the stat maybe not even make sense any more? It's currently defined as, as code variables:

And as strings in Cove:

Now it's not clear what "have" or "declared" means - is that just "isPep" or something else?

Where as for 0.3+ what you are probably interested in is:

Maybe we keep the old stat for 0.2 only or we just drop it?

We can probably define "declared" carefully so we can have a backwards compatibility stat, but maybe that's just confusing and we should be clear instead by making new stats?

2)

This points to additional checks we can do:

StephenAbbott commented 2 years ago

The Open Ownership Register is the only known publisher of BODS v0.1 data but we are in the process of moving to exporting to v0.2 so I agree with @kd-ods comments above that I'm not worried about making this work for 0.1 data and also fine to hide 0.1 data stats @odscjames

kd-ods commented 2 years ago

(1) How about for 0.2+

[I'm not sure how we present those breakdowns for other fields, but we should be consistent.]

(2) In the politicalExposure.status description we only use 'should': "An ‘unknown’ value means a PEP status declaration is expected but missing; the reason for the missing data SHOULD be supplied in the details array". So we only need this check:

If pep status != unknown there must NOT be details with a missingInfoReason.

odscjames commented 2 years ago

For the stat,

Person Statements / Total Statements ... where political exposure status is published. [i.e. has any value at all]

So in 0.3 we'll take it as any string value, and for 0.2 we'll take it as a True OR a False - basically, does the 'hasPepStatus' key exist.

Breakdown of political exposure statuses: isPep - x% / isNotPep - y% / unknown - z%

For 0.2 we'll map True == isPep & False == isNotPep

For the check,

If pep status != unknown there must NOT be details with a missingInfoReason.

Fine for 0.3

For 0.2, do we just ignore? Or can we say: if pep status == true there must NOT be details with a missingInfoReason.

kd-ods commented 2 years ago

That sounds good.

For 0.2, do we just ignore?

No, let's do the check, as you say: if pep status == true there must NOT be details with a missingInfoReason.

Blueskies00 commented 2 years ago

Tests https://docs.google.com/spreadsheets/d/14VRs8jmgtoHNLGEcEIqZyyso3aAueRvZRkUwkWN1oI0/edit#gid=1797299499

  1. PEP status = "" AND missingInfoReason completed. Expected result: Error - incorrect/missing pep status. Actual result: Validation check - "politicalExposure/status is missing but required. Check that the field is included and correctly spelled." Additional check - "This Person Statement has some PEP details but their PEP status has been declared as 'isNotPep'." Additional check - "This Person Statement has some PEP details with missing info but their status has not been declared as 'unknown'. " Comments: See https://github.com/openownership/lib-cove-bods/issues/87#issuecomment-1169659030 Test: FAILED ACTIONS: See https://github.com/openownership/lib-cove-bods/issues/87#issuecomment-1169659030
  2. PEP status = isNotPep AND missingInfoReason completed. Expected result: Error - incorrect pep status. Actual result: Additional check - " This Person Statement has some PEP details but their PEP status has been declared as 'isNotPep'." Additional check - "This Person Statement has some PEP details with missing info but their status has not been declared as 'unknown'. " Comments: As above, there is an overlapping check result going on here. Test: FAILED ACTIONS: As above.
  3. PEP status = isPep AND missingInfoReason completed. Expected result: Error - incorrect pep status. Actual result: Additional check - "This Person Statement has some PEP details with missing info but their status has not been declared as 'unknown'. " Comments: Test: PASSED ACTIONS: None.

Statistics: 8 statements 5 with PepStatus declared 3 isPep 1 isNotPep 1 unknown

Expected result: To see the above represented in the statistics, with associated percentages. Actual result: As expected. Comments: Test: PASSED ACTIONS: None.

Blueskies00 commented 2 years ago

Reran the failed tests above. Messages have been updated and overlapping errors corrected. All good!