openownership / lib-cove-bods

Check that your data complies with the Beneficial Ownership Data Standard (BODS): install our data review library to analyse files via your command line interface
https://datareview.openownership.org/

Additional check: market identifier codes (MICs) #80

Closed kd-ods closed 1 month ago

kd-ods commented 2 years ago

See https://standard.openownership.org/en/master/schema/guidance/identifiers.html#market-identifier-codes-mics

The checks would be:

If one of the properties marketIdentifierCode (MIC) or operatingMarketIdentifierCode (operating MIC) has a non-empty string value, then:

odscjames commented 2 years ago

MICs seem to be published regularly, so the tool should try to download and cache them for a bit rather than hardcoding a version into the lib.

Just to be clear: https://www.iso20022.org/market-identifier-codes -> "MIC list by country (.csv)": marketIdentifierCode should be from column C and operatingMarketIdentifierCode should be from column D. And it's valid that they could both be the same value (e.g. see row 92).

On handling the temporal aspect: All data will be checked against the latest version of the MIC list for now.

This may cause problems if a MIC is withdrawn: old data that was previously valid would then be flagged as invalid.

I can't see a way around that apart from starting our own MIC list somewhere with history. (We could set up a GitHub repository with a GitHub Action that regularly downloaded the CSV and committed it in some form to the repository.)
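The "download and cache them for a bit" idea could look roughly like the sketch below. It is only an illustration, not what the library does: `load_mic_csv`, the cache path, the one-day default and the injected `fetch` callable are all hypothetical names chosen for this example, and the actual download is left to the caller so the caching logic can be read (and tested) without network access.

```python
import time
from pathlib import Path
from typing import Callable, Optional


def load_mic_csv(
    cache_path: Path,
    fetch: Callable[[], str],
    max_age_seconds: float = 24 * 60 * 60,  # assumption: "a bit" means one day
    now: Optional[float] = None,
) -> str:
    """Return the MIC list as CSV text, calling fetch() only when the cache is stale.

    fetch is a hypothetical callable that downloads the current list and
    returns it as text; it is injected so this function has no network code.
    """
    now = time.time() if now is None else now
    if cache_path.exists():
        age = now - cache_path.stat().st_mtime
        if age < max_age_seconds:
            # Cache is fresh enough: reuse it.
            return cache_path.read_text(encoding="utf-8")
    # Cache is missing or too old: download again and overwrite it.
    text = fetch()
    cache_path.write_text(text, encoding="utf-8")
    return text
```

Because only the latest list is kept, this sketch has exactly the limitation described above: a withdrawn MIC disappears from the cache at the next refresh.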

odscjames commented 2 years ago

On handling the temporal aspect: All data will be checked against the latest version of the MIC list for now.

In call: Yes!

Set up a Git repo as a low priority, so we have information for a future decision on how much these change.

kd-ods commented 2 years ago

Just to be clear: https://www.iso20022.org/market-identifier-codes -> "MIC list by country (.csv)": "marketIdentifierCode" should be from column C and the operatingMarketIdentifierCode should be from column D

That's right. And looking at the Excel sheet, @Blueskies00 noticed that the column headers are not visible because they are in white(!).

And it's valid that they could both be the same value (e.g. see row 92)

Yes - that's valid.

On handling the temporal aspect..

Had you seen, @odscjames, that the Excel sheet contains a 'MICs List Deactivated MICs' sheet with a full list of previous MICs? Could we please do the validation check against 'MICs List Deactivated MICs' appended to 'MICs List by Country'? (There's also a recent 'MICs Modifications' sheet, but we're not so concerned about that.)
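Checking against the deactivated list "appended to" the by-country list amounts to taking the union of the two. A minimal sketch, assuming both sheets have already been exported to CSV text with header columns named `MIC` and `OPERATING MIC` (those header names, the function names and the sample rows are assumptions for illustration, not taken from the published file):

```python
import csv
import io


def mic_pairs(csv_text, mic_column="MIC", operating_column="OPERATING MIC"):
    """Return the set of (MIC, operating MIC) pairs found in one CSV export.

    The column names are assumptions; adjust them to the real header row.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        (row[mic_column].strip(), row[operating_column].strip())
        for row in reader
    }


def known_mic_pairs(active_csv, deactivated_csv):
    """Union of current and deactivated pairs, so old data still validates."""
    return mic_pairs(active_csv) | mic_pairs(deactivated_csv)


# Invented sample rows, for illustration only.
active = "MIC,OPERATING MIC\nAAAA,AAAA\nAAAB,AAAA\n"
deactivated = "MIC,OPERATING MIC\nZZZZ,ZZZZ\n"
pairs = known_mic_pairs(active, deactivated)
print(("ZZZZ", "ZZZZ") in pairs)  # a withdrawn pair is still accepted
```

A record using a withdrawn MIC is then treated the same as one using a current MIC, which sidesteps the temporal problem without keeping a full history.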

odscjames commented 2 years ago

Had you seen, @odscjames, that the Excel sheet contains a 'MICs List Deactivated MICs' sheet with a full list of previous MICs? Could we please do the validation check against 'MICs List Deactivated MICs' appended to 'MICs List by Country'? (There's also a recent 'MICs Modifications' sheet, but we're not so concerned about that.)

Had not seen.

OK, but reading Excel is a pain, so in that case I am thinking of setting up the history repository first. That repository can get the Excel file and write CSVs to the repo. Then this lib can just grab a CSV directly from GitHub.

Also, I have more confidence that GitHub's raw file feature won't start blocking us; https://www.iso20022.org might.

kd-ods commented 2 years ago

Great. Thanks!

odscjames commented 2 years ago

https://github.com/openownership/ISO10383 is set up. I realised I was looking at the old format, and we have to make sure we look at the new format, as the old one stops on 31 October 2022.

odscjames commented 2 years ago

BTW it looks like we have the data to check stockExchangeJurisdiction if a MIC is set?

kd-ods commented 2 years ago

it looks like we have the data to check stockExchangeJurisdiction if MICs is set

Can you add another issue for that, @odscjames? It's a relatively low-priority check so we'll put it at the bottom of the backlog.
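For whoever picks up that follow-up issue, the jurisdiction check might be sketched as below. Everything here is an assumption rather than agreed behaviour: the function name, the `mic_countries` mapping (MIC to ISO 3166 country code, presumably built from the MIC list), and the rule that a subdivision code such as `GB-ENG` is compared by its country prefix only.

```python
def jurisdiction_matches(listing, mic_countries):
    """Return False only when the listing's jurisdiction contradicts its MIC.

    listing is a dict with optional 'marketIdentifierCode' and
    'stockExchangeJurisdiction' keys; mic_countries is a hypothetical
    mapping from MIC to a 2-letter country code.
    """
    mic = listing.get("marketIdentifierCode")
    jurisdiction = listing.get("stockExchangeJurisdiction")
    if not mic or not jurisdiction or mic not in mic_countries:
        # Nothing to compare, so this check has nothing to report.
        return True
    # Assumption: compare only the country part of a subdivision code.
    country = jurisdiction.split("-")[0].upper()
    return country == mic_countries[mic].upper()


# Invented mapping entry, for illustration only.
print(jurisdiction_matches(
    {"marketIdentifierCode": "AAAA", "stockExchangeJurisdiction": "GB"},
    {"AAAA": "GB"},
))
```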

Blueskies00 commented 2 years ago

Note: Due to licensing concerns, currently only the following check has been put in place, and subsequently tested:

If one of the properties marketIdentifierCode (MIC) or operatingMarketIdentifierCode (operating MIC) has a non-empty string value, then: the other property should also have a non-empty string value. (Error message: "You have supplied a value for [marketIdentifierCode/operatingMarketIdentifierCode] so a value for [operatingMarketIdentifierCode/marketIdentifierCode] should also be provided")
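As an illustration of the rule just described (not the library's actual code), the pair check could be written along these lines. The function names are made up for this sketch, the messages are modelled on the actual results quoted in the tests in this comment, and the `publicListing.securitiesListings` path is an assumption about where the two properties sit in an entity statement.

```python
def _is_set(value):
    """True only for a non-empty string, matching the wording of the check."""
    return isinstance(value, str) and value != ""


def check_mic_pair(listing):
    """Return a list of messages for one securities listing (empty if fine)."""
    has_mic = _is_set(listing.get("marketIdentifierCode"))
    has_operating = _is_set(listing.get("operatingMarketIdentifierCode"))
    if has_mic and not has_operating:
        return ["This Entity Statement has a security listing where "
                "marketIdentifierCode is set but "
                "operatingMarketIdentifierCode is not set."]
    if has_operating and not has_mic:
        return ["This Entity Statement has a security listing where "
                "operatingMarketIdentifierCode is set but "
                "marketIdentifierCode is not set."]
    # Both set or both absent: nothing to report. The values themselves
    # are deliberately not validated against the ISO list.
    return []


def check_entity_statement(statement):
    """Run the pair check over every listing in a statement (assumed path)."""
    public_listing = statement.get("publicListing") or {}
    messages = []
    for listing in public_listing.get("securitiesListings") or []:
        messages.extend(check_mic_pair(listing))
    return messages
```

Note that a pair of syntactically present but unrecognised codes passes, which is the expected outcome while the check against the ISO list is not in place.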

Tests https://docs.google.com/spreadsheets/d/17_lF1ctm8QrTWxm8tEel4jelQ5fCnZM_PE7uZzrg3cs/edit#gid=353511506

  1. Entry with valid MIC pair. Expected result: No errors. Actual result: No errors. Comments: Test: PASSED ACTIONS: None.
  2. Entry with MIC missing AND Operating MIC present. Expected result: Error - missing MIC. Actual result: Additional check - "This Entity Statement has a security listing where operatingMarketIdentifierCode is set but marketIdentifierCode is not set." Comments: Test: PASSED ACTIONS: None.
  3. Entry with MIC present AND Operating MIC missing. Expected result: Error - missing MIC. Actual result: Additional check - "This Entity Statement has a security listing where marketIdentifierCode is set but operatingMarketIdentifierCode is not set." Comments: Test: PASSED ACTIONS: None.
  4. Entry with invalid MIC pair. Expected result: No error. Actual result: No errors. Comments: No error is expected since the check against the ISO list has been removed. However, it is included here for posterity. Test: PASSED ACTIONS: None.
  5. Entry with no MIC codes. Expected result: No error. Actual result: No errors. Comments: MIC codes aren't a required field. Test: PASSED ACTIONS: None.