Open StephenAbbott opened 1 year ago
Working theory would be the jurisdiction code is {country-code}_{region-code}
Jurisdiction codes are always likely to have those two parts, but I think any ingester looking at a jurisdiction code field will need to parse various separators. It looks like both OpenCorporates and org-id use the underscore separator. But subdivision codes in ISO 3166-2 start with the country code and then append a region code (two letters in Canada's case), with a hyphen separator. For example, 'CA-AB' is Alberta, Canada. All of Canada's subdivision codes are here.
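To illustrate the separator handling described above, here is a minimal sketch in plain Ruby. The names `JurisdictionParts`, `parse_jurisdiction_code`, `to_opencorporates_code` and `to_iso_3166_code` are hypothetical and do not exist in the register codebase. The sketch assumes a code is a country part optionally followed by one region part, separated by either an underscore or a hyphen.

```ruby
# Hypothetical helpers for illustration only; not part of the register codebase.
JurisdictionParts = Struct.new(:country, :region)

# Split a jurisdiction code on "_" (OpenCorporates / org-id style)
# or "-" (ISO 3166-2 style). Returns nil for blank input.
def parse_jurisdiction_code(code)
  return nil if code.nil? || code.strip.empty?

  country, region = code.strip.downcase.split(/[_-]/, 2)
  region = nil if region.nil? || region.empty? # e.g. a trailing separator in "ca_"
  JurisdictionParts.new(country, region)
end

# Render in the underscore form, e.g. "ca_pe" (or just "ca" without a region).
def to_opencorporates_code(parts)
  [parts.country, parts.region].compact.join('_')
end

# Render in the ISO 3166-2 form, e.g. "CA-PE" (or just "CA" without a region).
def to_iso_3166_code(parts)
  [parts.country, parts.region].compact.join('-').upcase
end
```

With this, `parse_jurisdiction_code('ca_pe')` and `parse_jurisdiction_code('CA-PE')` yield the same parts, so the stored value can be rendered in whichever form the consumer expects.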
We say in BODS that Jurisdiction.code is "The 2-letter country code (ISO 3166-1) or the subdivision code (ISO 3166-2) for the jurisdiction". See schema ref. So the Mongo jurisdiction_code should probably do likewise.
If this service (https://beta.canadasbusinessregistries.ca/about) were improved to include more Canadian registers and to offer an API, it could be useful for resolving ca codes. Unfortunately, it's rather lacking at the moment.
Inspired by this Twitter thread, I found myself searching for a number of Scottish Qualifying Partnerships on the Open Ownership Register. This took me to the following search results page where we realised that the duplicate entities are not being resolved due to an OpenCorporates issue.
@spacesnottabs investigated further and discovered that OpenCorporates has the company under jurisdiction ca_pe for "Prince Edward Island (Canada)" but the Register is parsing the jurisdiction as ca (Canada). If we try to resolve the record with ca as the jurisdiction code, the lookup finds nothing.
Sample PSC record:
Our sample Entity stored in Mongo:
#<Entity _id: 630e81eab19f5888b5a78d34, updated_at: 2022-08-30 21:32:26.818 UTC, type: "legal-entity", name: "Integritas (Canada) Trustee Corporation", address: "65, Grafton Street, Charlottestown, Prince Edward Island C1a8b9", nationality: nil, country_of_residence: nil, dob: nil, jurisdiction_code: "ca", company_number: "13174", incorporation_date: nil, dissolution_date: nil, company_type: nil, restricted_for_marketing: nil, lang_code: nil, identifiers: [{"document_id"=>"GB PSC Snapshot", "link"=>"/company/SG000612/persons-with-significant-control/corporate-entity/RnA_vTfWVHeC1PJqQqRw8LZuFoU", "company_number"=>"13174"}], merged_entities_count: nil, master_entity_id: nil, oc_updated_at: nil, last_resolved_at: nil, self_updated_at: 2022-08-30 21:32:26.818 UTC, _type: "Entity">
Currently "region" is not used in the code at all; only the country is used. This is fine for our gb, dk and sk jurisdictions, but doesn't work for overseas jurisdictions such as Canada, where companies are registered at the regional level.
We need to extend our support to use both region and country when deriving the jurisdiction name/code. This means upgrading the countries gem we already use to the latest version (and fixing the breaking changes): https://github.com/countries/countries
The work involved will make sure we can find the jurisdiction even if the name isn't an exact match. Working theory would be the jurisdiction code is {country-code}_{region-code} but this needs to be checked against the gem and org-id.guide approach: https://org-id.guide/results?structure=all&coverage=CA&sector=all
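As a rough sketch of that working theory, the snippet below builds a {country-code}_{region-code} value from a country code plus free text that may mention a region. Everything here is an assumption for illustration: `CA_SUBDIVISIONS` is a hand-written subset of Canada's ISO 3166-2 subdivisions (a real implementation would presumably read them from the countries gem), and `normalise_name` and `jurisdiction_code_for` are hypothetical names. The inexact matching is a simple case- and punctuation-insensitive substring check, not anything the gem provides.

```ruby
# Illustrative subset only; assumed to come from the countries gem in practice.
CA_SUBDIVISIONS = {
  'ab' => 'Alberta',
  'bc' => 'British Columbia',
  'on' => 'Ontario',
  'pe' => 'Prince Edward Island',
  'qc' => 'Quebec'
}.freeze

# Lowercase and collapse punctuation/whitespace so that
# "Prince Edward Island (Canada)" compares as "prince edward island canada".
def normalise_name(name)
  name.to_s.downcase.gsub(/[^a-z0-9]+/, ' ').strip
end

# Returns e.g. "ca_pe" when the text mentions a known subdivision name,
# and falls back to the bare country code (e.g. "ca") otherwise.
def jurisdiction_code_for(country_code, text, subdivisions)
  haystack = normalise_name(text)
  region_code, = subdivisions.find do |_code, name|
    haystack.include?(normalise_name(name))
  end
  [country_code.downcase, region_code].compact.join('_')
end
```

For instance, `jurisdiction_code_for('CA', 'Prince Edward Island (Canada)', CA_SUBDIVISIONS)` gives `"ca_pe"`, while text that names no known subdivision falls back to `"ca"`. Whether the PSC data reliably carries the region name in a usable field still needs checking.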