openownership / register

A demonstration transnational register of beneficial ownership data from the UK, Denmark, Slovakia and Armenia
https://register.openownership.org
GNU Affero General Public License v3.0

Update approach to using OpenCorporates identifiers for sub-national jurisdictions #121

Open StephenAbbott opened 1 year ago

StephenAbbott commented 1 year ago

Inspired by this Twitter thread, I found myself searching for a number of Scottish Qualifying Partnerships on the Open Ownership Register. This took me to the following search results page where we realised that the duplicate entities are not being resolved due to an OpenCorporates issue.

@spacesnottabs investigated further and discovered that OpenCorporates has the company under jurisdiction ca_pe for "Prince Edward Island (Canada)", but the Register is parsing the jurisdiction as ca (Canada). If we try to resolve the record with ca as the jurisdiction code, OpenCorporates finds nothing.

Sample PSC record:

{
  "company_number": "SG000612",
  "data": {
    "address": {
      "address_line_1": "Grafton Street",
      "country": "Canada",
      "locality": "Charlottestown",
      "premises": "65",
      "region": "Prince Edward Island C1a8b9"
    },
    "etag": "510f53dafafaf4acf43a16964418a2cf8ccc9a3e",
    "identification": {
      "country_registered": "Canada",
      "legal_authority": "Canada",
      "legal_form": "Private Company",
      "place_registered": "Pei Business/Corporate Registry",
      "registration_number": "13174"
    },
    "kind": "corporate-entity-person-with-significant-control",
    "links": {
      "self": "/company/SG000612/persons-with-significant-control/corporate-entity/RnA_vTfWVHeC1PJqQqRw8LZuFoU"
    },
    "name": "Integritas (Canada) Trustee Corporation",
    "natures_of_control": [
      "right-to-appoint-and-remove-person"
    ],
    "notified_on": "2017-06-26"
  }
}

Our sample Entity stored in Mongo: #<Entity _id: 630e81eab19f5888b5a78d34, updated_at: 2022-08-30 21:32:26.818 UTC, type: "legal-entity", name: "Integritas (Canada) Trustee Corporation", address: "65, Grafton Street, Charlottestown, Prince Edward Island C1a8b9", nationality: nil, country_of_residence: nil, dob: nil, jurisdiction_code: "ca", company_number: "13174", incorporation_date: nil, dissolution_date: nil, company_type: nil, restricted_for_marketing: nil, lang_code: nil, identifiers: [{"document_id"=>"GB PSC Snapshot", "link"=>"/company/SG000612/persons-with-significant-control/corporate-entity/RnA_vTfWVHeC1PJqQqRw8LZuFoU", "company_number"=>"13174"}], merged_entities_count: nil, master_entity_id: nil, oc_updated_at: nil, last_resolved_at: nil, self_updated_at: 2022-08-30 21:32:26.818 UTC, _type: "Entity">

Currently "region" is not used in the code at all; only country is used. This is fine for our gb, dk and sk jurisdictions, but doesn't work for overseas countries with sub-national registers, such as Canada.

We need to extend our support to use both region and country to get the jurisdiction name/code. This means upgrading the countries gem we already use to the latest version (and fixing the breaking changes): https://github.com/countries/countries

The work involved should make sure we can find the jurisdiction even if the region name isn't an exact match. The working theory is that the jurisdiction code is {country-code}_{region-code}, but this needs to be checked against the gem and the org-id.guide approach: https://org-id.guide/results?structure=all&coverage=CA&sector=all
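To make the idea concrete, here is a minimal sketch of deriving a jurisdiction code from both country and region, assuming the {country-code}_{region-code} working theory holds. The lookup tables and the jurisdiction_code helper are hypothetical stand-ins written for illustration; they are not the Register's code or the countries gem API. The real data would presumably come from the gem.

```ruby
# Hypothetical lookup tables for illustration only. In practice this data
# would come from the countries gem or a similar source.
COUNTRY_CODES = { "canada" => "ca", "united kingdom" => "gb" }.freeze

SUBDIVISIONS = {
  "ca" => {
    "alberta" => "ab",
    "prince edward island" => "pe",
    "quebec" => "qc"
  }
}.freeze

# Lowercases the text and replaces everything except ASCII letters and
# whitespace with spaces, so "Prince Edward Island C1a8b9" becomes
# "prince edward island c a b".
def normalise(text)
  text.to_s.downcase.gsub(/[^a-z\s]/, " ").squeeze(" ").strip
end

# Returns "{country}_{region}" when the region text begins with a known
# subdivision name, the bare country code when it does not, and nil when
# the country itself is unknown.
def jurisdiction_code(country:, region: nil)
  country_code = COUNTRY_CODES[normalise(country)]
  return nil unless country_code

  region_name = normalise(region)
  subdivisions = SUBDIVISIONS.fetch(country_code, {})

  # Try longer names first so a longer name wins over any shorter name it begins with.
  match = subdivisions.keys.sort_by { |name| -name.length }.find do |name|
    region_name == name || region_name.start_with?("#{name} ")
  end

  match ? "#{country_code}_#{subdivisions[match]}" : country_code
end
```

With the sample PSC record above, jurisdiction_code(country: "Canada", region: "Prince Edward Island C1a8b9") would give ca_pe even though the region field has a postcode appended. Prefix matching is only one possible way to tolerate inexact names; misspellings would need something stronger.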

kd-ods commented 1 year ago

Working theory would be the jurisdiction code is {country-code}_{region-code}

Jurisdiction codes are always likely to have those two parts, but I think any ingester looking at a jurisdiction code field will need to parse various separators. It looks like both OpenCorporates and org-id use the underscore separator. But subdivision codes in ISO 3166-2 start with the country code and then append a subdivision code (two letters in Canada's case), with a hyphen separator. For example, 'CA-AB' for Alberta, Canada. All Canada's subdivision codes are here.
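As a rough sketch of what separator-tolerant parsing could look like, the helpers below accept either an underscore or a hyphen and convert between the two forms. The function names are hypothetical and the code only checks the shape of the country part; it does not validate codes against ISO 3166.

```ruby
# Splits a jurisdiction code into country and optional subdivision parts,
# accepting "_" (OpenCorporates, org-id) or "-" (ISO 3166-2) as separator.
# Returns nil if the country part is not two letters.
def parse_jurisdiction_code(code)
  country, subdivision = code.to_s.strip.downcase.split(/[_-]/, 2)
  return nil unless country =~ /\A[a-z]{2}\z/

  subdivision = nil if subdivision.to_s.empty?
  { country: country, subdivision: subdivision }
end

# Formats a code in the ISO 3166 style, e.g. "CA-PE" or "GB".
def to_iso_3166(code)
  parts = parse_jurisdiction_code(code)
  return nil unless parts

  [parts[:country], parts[:subdivision]].compact.join("-").upcase
end

# Formats a code in the underscore style, e.g. "ca_pe" or "gb".
def to_underscore_form(code)
  parts = parse_jurisdiction_code(code)
  return nil unless parts

  [parts[:country], parts[:subdivision]].compact.join("_")
end
```

For example, to_iso_3166("ca_pe") gives "CA-PE" and to_underscore_form("CA-AB") gives "ca_ab", so a code could be stored in one form and converted at the point of lookup.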

We say in BODS that Jurisdiction.code is "The 2-letter country code (ISO 3166-1) or the subdivision code (ISO 3166-2) for the jurisdiction". See schema ref. So the Mongo jurisdiction_code should probably do likewise.

kd-ods commented 1 year ago

If this service (https://beta.canadasbusinessregistries.ca/about) were improved to include more Canadian registers and to offer an API, it could be useful for resolving ca codes. Unfortunately, it's rather lacking at the moment.