opencivicdata / ocd-division-ids

Open Civic Data Division IDs definition & canonical repository

Incorrect Country Code used for United Kingdom #184

Open jloutsenhizer opened 4 years ago

jloutsenhizer commented 4 years ago

#156 added OCD IDs for the United Kingdom but used uk for the country code rather than the "gb" from ISO-3166. #181 introduces the parliament constituencies using the correct gb country code.

You can see #181 for some of the previous discussion. I'm proposing we use the OCD IDs from #181 and alias the IDs from #156 to these new IDs (possibly via a corrections file).
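As a rough illustration of the aliasing idea, the sketch below maps old IDs to new ones with a simple lookup table. It is only a sketch: the names `ALIASES` and `resolve` are hypothetical and not part of the repository, the two pairs are the ones quoted later in this thread, and the real corrections mechanism may look quite different.

```python
# Hypothetical alias table: old (#156-style) ID -> new (#181-style) ID.
# Only the two pairs quoted in this thread are listed; a real table would
# need an entry for every affected identifier.
ALIASES = {
    "ocd-division/country:uk/pcon:ynys_mon":
        "ocd-division/country:gb/part:wls/ed:ynys_môn",
    "ocd-division/country:uk/pcon:richmond_~yorks~":
        "ocd-division/country:gb/part:eng/region:uke/ed:richmond_yorks",
}


def resolve(ocd_id: str) -> str:
    """Return the current ID for ocd_id, following an alias if one exists."""
    return ALIASES.get(ocd_id, ocd_id)


print(resolve("ocd-division/country:uk/pcon:ynys_mon"))
```

IDs that have no alias are returned unchanged, so callers could pass every identifier through `resolve` without special-casing the UK ones.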

Tagging people likely interested in this change, please voice concerns if this might break how you use these identifiers. @chris48s @showerst @symroe @sguenther85 @jpmckinney

chris48s commented 4 years ago

Thanks for tagging me in on this.

The identifiers should definitely be migrated from :uk to :gb. I'm totally on-board with this. This is an unfortunate mistake on my part which was not caught in review. From our perspective, it is fine for the move to be framed as corrections as opposed to aliases. I'm also happy to submit a PR migrating the code for generating the sub-national division identifiers (although I will probably defer this until after polling day for the national elections).


Having looked at the IDs, it's unfortunate that we've applied the slugging rules to slightly different source data (it's unclear what the source data is for #181), which has led to a couple of inconsistencies in the ID slugs:

ocd-division/country:gb/part:wls/ed:ynys_môn
ocd-division/country:uk/pcon:ynys_mon

ocd-division/country:gb/part:eng/region:uke/ed:richmond_yorks
ocd-division/country:uk/pcon:richmond_~yorks~

...but this is not a big deal.
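The slug differences above are consistent with one rule being applied to differently spelled source names. A minimal sketch, assuming the slugging rule is "lowercase, spaces to underscores, keep letters, digits, `.`, `-`, `_` and `~`, and turn everything else into `~`" (my reading of the OCD ID conventions, not code from the repository). The function name `slugify` and the input names are assumptions for illustration.

```python
def slugify(name: str) -> str:
    """Slug a division name using the rule assumed in the text above."""
    lowered = name.lower().replace(" ", "_")
    # Keep letters (including non-ASCII ones such as 'ô'), digits and the
    # characters . - _ ~ ; every other character becomes a tilde.
    return "".join(
        ch if (ch.isalnum() or ch in "._-~") else "~"
        for ch in lowered
    )


# Assumed source spellings; the actual source data for each PR is unknown.
print(slugify("Ynys Môn"))          # keeps the accented letter
print(slugify("Ynys Mon"))          # source data without the accent
print(slugify("Richmond (Yorks)"))  # parentheses become tildes
```

Under that assumption, the same rule yields different slugs purely because the input names differ, which matches the pairs listed above.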


Unfortunately I was unable to comment before the PR was merged, but I think this next point is a big deal: I fully agree with the comments on incorrect hierarchy in https://github.com/opencivicdata/ocd-division-ids/pull/181#issuecomment-549548955 . Nesting UK Parliamentary Constituencies implies hierarchy that doesn't exist for this division type. This creates confusion when the same division names are used for the national (UK) parliament and sub-national (Scotland, Wales, NI) parliaments/assemblies. For example, ocd-division/country:gb/part:wls/ed:cardiff_south_and_penarth implies the scope of the identifier is Wales, when in fact it's a division of a body with UK-wide scope. In the existing :uk IDs, we disambiguate that using the division type e.g:

The really important thing to encode in the identifier is what type of division it is, or what body officials are elected to, as opposed to just what part of the country the division is in and that it is an "electoral division". Unfortunately the IDs as merged don't achieve that.

I'm not dogmatic about how it is solved, although I would like to lean on existing prefixes published by an official body, as opposed to rolling our own. For example, I'd be equally happy to see (and help implement) any of the following ways of going about this disambiguation:

No hierarchy, using the Office for National Statistics prefixes (pcon, nawc, etc)

Hierarchical, using the Office for National Statistics prefixes (pcon, nawc, etc)

No hierarchy, using the Ordnance Survey prefixes (wmc, wac, etc)

Hierarchical, using the Ordnance Survey prefixes (wmc, wac, etc)

Note that the hierarchy isn't strictly necessary, given the names/slugs will be unique within those prefixes, but the IDs as merged do present a problem that needs solving.
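To make the four options concrete, here is a small sketch that builds one ID in each style. Several things here are assumptions rather than anything confirmed in this thread: that "hierarchical" means nesting under a `part:` segment, that the IDs sit under `country:gb`, and that `wmc` is the Ordnance Survey counterpart of `pcon`. The helper `build_id` is hypothetical.

```python
from typing import Optional


def build_id(type_prefix: str, slug: str, part: Optional[str] = None) -> str:
    """Build a division ID, optionally nested under a constituent part."""
    segments = ["ocd-division", "country:gb"]  # country:gb is an assumption
    if part is not None:
        segments.append(f"part:{part}")
    segments.append(f"{type_prefix}:{slug}")
    return "/".join(segments)


slug = "cardiff_south_and_penarth"
print(build_id("pcon", slug))              # no hierarchy, ONS prefix
print(build_id("pcon", slug, part="wls"))  # hierarchical, ONS prefix
print(build_id("wmc", slug))               # no hierarchy, OS prefix
print(build_id("wmc", slug, part="wls"))   # hierarchical, OS prefix
```

In all four styles the type prefix, rather than the nesting, is what tells a UK Parliament constituency apart from an assembly constituency with the same name.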

jpmckinney commented 4 years ago

Since #181 was only merged a few hours ago, and since everyone involved is tagged here, I am happy to consider changes, including re-alignment with the pattern used for the uk prefixes.

I think a process step we should have completed was to be sure to tag and invite all participants in #156 to comment on #181 – but we can still fix that now.

chris48s commented 4 years ago

Hi @jpmckinney

Thanks for showing flexibility on this.

If we accept that the changes in #181 aren't set in stone, I've submitted another PR over at #186 with a proposed migration for moving the full suite of existing identifiers from country:uk to country:gb, updating the build scripts, docs and the identifiers themselves.

This would be the least disruptive approach to moving the existing identifiers.
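The identifier part of such a migration could be sketched as a prefix swap. This assumes the migration only replaces the country segment and leaves the rest of each ID untouched; the actual PR also touches build scripts and docs, and the function name `migrate` is hypothetical.

```python
OLD_PREFIX = "ocd-division/country:uk"
NEW_PREFIX = "ocd-division/country:gb"


def migrate(ocd_id: str) -> str:
    """Swap country:uk for country:gb, leaving any other ID unchanged."""
    # Match the country segment exactly, so that a longer code that merely
    # starts with "uk" is not rewritten by accident.
    if ocd_id == OLD_PREFIX or ocd_id.startswith(OLD_PREFIX + "/"):
        return NEW_PREFIX + ocd_id[len(OLD_PREFIX):]
    return ocd_id


print(migrate("ocd-division/country:uk/pcon:ynys_mon"))
```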

While they're in flux, it would be reasonable to review in the context of the points raised above re: slugs and nesting.

With respect to introducing hierarchy, it's not required as slugs are necessarily unique within each legislature (although as noted, not across legislatures) but if you want to introduce it, now's the time (with the caveat that any hierarchy that is introduced should reflect the actual hierarchies of electoral organisations/geography in use on the ground).

jloutsenhizer commented 4 years ago

Thank you for the detailed response @chris48s.

I believe the hierarchy in #181 was coming from the way the various boundary commissions define the UK parliament constituencies. Although, I can see how it would make sense to have the hierarchy reflect the scope of the government they are defined for (all UK parliament constituencies being in country:gb and then all Assembly of Wales constituencies being in country:gb/part:wls).

I haven't looked too much into the Assembly of Wales, so please correct me if I'm wrong, but it looks to me that the Assembly of Wales constituencies are identical to the UK Parliament constituencies, so in this case of identical names they would correspond to the same place by the way they are defined. There's a desire to have some consistency in the type names (see #170) which is why I had requested "ed" type in the PR. Perhaps in this case it would make sense to create aliases with different types for the different parliaments that use the same constituencies?

I do acknowledge that this might not always be the case, so it might still make sense to have these more specific types.

I think @sguenther85 should be able to fill in some of the details and will have a more informed opinion about the hierarchy of these OCD IDs and the type names when it comes to different levels of UK government.

chris48s commented 4 years ago

I can see how it would make sense to have the hierarchy reflect the scope of the government they are defined for

Exactly. Also, bear in mind if we were to add OCD IDs for the divisions used in local government elections (the county, district and borough councils) which we don't have at the moment but may need in future, that's where you really need hierarchy both for name disambiguation and to describe the relationship between the divisions and organisations they're children of.


I haven't looked too much into the Assembly of Wales, so please correct me if I'm wrong, but it looks to me that the Assembly of Wales constituencies are identical to the UK Parliament constituencies, so in this case of identical names they would correspond to the same place by the way they are defined. There's a desire to have some consistency in the type names (see #170) which is why I had requested "ed" type in the PR. Perhaps in this case it would make sense to create aliases with different types for the different parliaments that use the same constituencies?

I do acknowledge that this might not always be the case, so it might still make sense to have these more specific types.

In general, it is misleading to assume correlation between boundaries used by the devolved parliaments and assemblies and those used by the UK Parliament, even though such correlations sometimes exist. For the purposes of these examples, I'm going to use the OCD IDs I've proposed in #186.


Finally, note that both the Welsh Assembly and the Scottish Parliament use Additional Member System/Mixed-Member Proportional Representation elections, so there are 2 sets of boundaries in use - Constituencies (for the members elected by First Past The Post) and Regions (for the members elected by Closed-List Proportional Representation). As such, it's important to encode in the identifiers both the legislature (e.g: Scottish Parliament) and the type of division (e.g: Region), which the existing prefixes also achieve. It's also in line with the comments in https://github.com/opencivicdata/ocd-division-ids/issues/170#issuecomment-549191172 - thanks for linking to that thread :+1:

jpmckinney commented 4 years ago

@sguenther85 Please contribute to this issue, so that we can restore a consistent approach to UK identifiers.

sguenther85 commented 4 years ago

@jpmckinney @chris48s Hey, sorry. You're right, I should contribute to this issue. Since a snap election has been announced for the UK, I have a lot to do. But I will write something on this issue by tomorrow at the latest. I just have to read the whole conversation before I write something ;)

jloutsenhizer commented 4 years ago

Thanks for the detailed explanations @chris48s. I'm convinced of the need for distinct types depending on the government the districts are for, and actually, looking at the US OCD IDs, this is already being done for different levels of government in the United States.

You also have a good point about the possible future shifts in the regions of the identifiers.

I still think there's some value in having the hierarchy around the UK parliament constituencies, and if you have the distinct types they won't collide with the local government OCD IDs, but I'd be interested to hear @sguenther85's point of view on this.

sguenther85 commented 4 years ago

So. Sorry for the long delay.

My idea was to represent the parliamentary constituencies of the UK, not those of the parliaments of Wales, Scotland or Northern Ireland. All 650 constituencies are constituencies of the country UK, and some of them are in different countries of the UK (where we use /part: instead of country).

If we also want to represent the constituencies of other parliaments in the same OCD country file, then we have to think about the best way to do it. A better way, if it is possible, would be to separate the constituencies of the parliaments of other countries like Wales, Scotland and Northern Ireland into extra country CSVs (like "country-gb_wls.csv" or something like that).

Or we have to go the way chris described, with doubled constituencies for different parliaments (from different countries).

The type names themselves, like "nawc" or "pcon", are unfortunately not very expressive in my view; you need a glossary to understand what each type means ;)

Working with both approaches plus aliases could also be a way to represent everything.

As I said, separating the constituencies of the different countries' parliaments into different files would be best from my point of view (if that's possible); otherwise you have to move to a solution like the one chris described.

Strictly speaking, we would have the same problem for many countries in the EU, like France with islands like Guadeloupe, La Réunion, Martinique...; they all have constituencies for France and for their own parliament.

Nevertheless, I would merge it, as mentioned in the first post, only after the next election on 12 December 2019.

jpmckinney commented 4 years ago

All identifiers belong to the same namespace (ocd-division). They are only separated into files for the convenience of maintaining an otherwise very large number of identifiers. Users of identifiers should not extract semantics from filenames; the organization of the files should be treated as meaningless when interpreting identifiers, as it is not part of the spec.

As for the types (ed, nawc, pcon, etc.): In many cases, a type needs a glossary to correctly interpret what the type is intended to represent, and that's not a major issue. I suppose that #156 could have chosen more human-readable types, but now that we have them, I think re-use of existing types is the better option.

Is there any issue with merging #186? This issue was opened the same day that #181 was merged, so we should be able to resolve this quickly, to avoid there being bad identifiers in the repo.

sguenther85 commented 4 years ago

no issue, I'm fine with it.

chris48s commented 4 years ago

The type names themselves, like "nawc" or "pcon", are unfortunately not very expressive in my view; you need a glossary to understand what each type means

:+1: This is a good point about terminology. The abbreviations like spr, cauth etc are pretty meaningless in isolation, especially to an international audience. Each type is presented in a different file, but I'd be happy to add an additional meta-data file defining the terms. Is that a reasonable thing to include in this repo? Is there an existing format or provision in the spec for this?

As I said, separating the constituencies of the different countries' parliaments into different files would be best from my point of view

As with a number of other countries, the divisions are presented in different files but also compiled into a single file with all ID types for the country merged: https://github.com/chris48s/ocd-division-ids/tree/issue184/identifiers/country-gb . Although the spec doesn't explicitly attach meaning to splitting the data into multiple files, from a purely pragmatic perspective, if you just care about the UK Parliament constituencies, there is a single file that contains only UK Parliament constituencies: https://github.com/chris48s/ocd-division-ids/blob/issue184/identifiers/country-gb/uk_parliament_consitituencies.csv (and so on for the other division types).

strictly speaking, we would have the same problem for many countries in the EU, like France with islands like Guadeloupe, La Réunion, Martinique...; they all have constituencies for France and for their own parliament

Getting a bit off-topic, but this seems like a review point before #187 is merged. In general it would be an undesirable quality to define identifiers for national government divisions in such a way that it makes it difficult to define identifiers for regional or local government divisions at a later time.

jloutsenhizer commented 4 years ago

For merging #186,

I'm on board with the use of different types for the different districts depending on the government they are for.

I think at some point in the future it'd be useful to have somewhere in the repo a file that annotates these types with information so that we know pcon, ed, nawc, are all types of electoral districts. I haven't given this much thought though and this would be a potential solution to #170.
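One possible shape for such an annotation, sketched in Python purely for illustration: the structure, the names `TYPE_GLOSSARY` and `is_electoral_district`, and the descriptions are all assumptions based on how the abbreviations are used in this thread, not an existing file or format in the repository or the spec.

```python
from typing import Dict, Optional

# Hypothetical glossary: division type -> what kind of division it is and
# which body it elects members to (None where the type is generic).
TYPE_GLOSSARY: Dict[str, Dict[str, Optional[str]]] = {
    "pcon": {"kind": "electoral district", "body": "UK Parliament"},
    "nawc": {"kind": "electoral district", "body": "National Assembly for Wales"},
    "ed": {"kind": "electoral district", "body": None},
}


def is_electoral_district(division_type: str) -> bool:
    """Return True if the glossary marks this type as an electoral district."""
    entry = TYPE_GLOSSARY.get(division_type)
    return entry is not None and entry["kind"] == "electoral district"


print(is_electoral_district("pcon"))
print(is_electoral_district("country"))
```

With something like this, tools could treat pcon, ed and nawc uniformly as electoral districts while still keeping the more specific type in the identifier itself.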

My only concern is around the hierarchy of the IDs, which I think should be added back in. Other OCD IDs do not limit hierarchy on electoral districts at the level of the government they are defined for. There are several examples of this in the repository currently:

I think it'd be useful to have these as a part of the GB OCD IDs as well, since the boundary commission of each part is in charge of defining their own districts. This would give us:

ocd-division/country:gb/part:wls/pcon:cardiff_south_and_penarth
ocd-division/country:gb/part:wls/nawc:cardiff_south_and_penarth

Which makes it easy to determine both of these districts are within Wales which is in the United Kingdom, and with the specific types used we can also know that they are for different governments.
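How a consumer might read both facts out of such an ID can be sketched as follows. This is an illustration only; `parse_ocd_id` is a hypothetical helper, not something provided by the repository.

```python
from typing import List, Tuple


def parse_ocd_id(ocd_id: str) -> List[Tuple[str, str]]:
    """Split an OCD division ID into (type, slug) pairs, outermost first."""
    prefix, *segments = ocd_id.split("/")
    if prefix != "ocd-division":
        raise ValueError(f"not an OCD division ID: {ocd_id!r}")
    pairs = []
    for segment in segments:
        division_type, _, slug = segment.partition(":")
        pairs.append((division_type, slug))
    return pairs


for ocd_id in (
    "ocd-division/country:gb/part:wls/pcon:cardiff_south_and_penarth",
    "ocd-division/country:gb/part:wls/nawc:cardiff_south_and_penarth",
):
    *parents, (leaf_type, leaf_slug) = parse_ocd_id(ocd_id)
    # parents gives the containing divisions; leaf_type gives the kind of district
    print(parents, leaf_type, leaf_slug)
```

Both IDs share the same parent segments (country gb, part wls) and the same slug, and differ only in the leaf type.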

chris48s commented 4 years ago

I think at some point in the future it'd be useful to have somewhere in the repo a file that annotates these types with information so that we know pcon, ed, nawc, are all types of electoral districts.

This does already exist in a README for this identifier set: https://github.com/opencivicdata/ocd-division-ids/blob/master/scripts/country-uk/README.md ... but what I was getting at above was that if there is a defined meta-data format for this, it would also be useful to formally define these there.

My only concern is around the hierarchy of the IDs, which I think should be added back in

I've opened PR #188 which adds a few commits on top of #186 to add a constituent nation clause to each identifier. I'll leave it to the committers to decide whether to merge #186 or #188 . I don't have a strong preference either way for this specific issue. If you prefer #188 I can tidy the history.

jpmckinney commented 4 years ago

@jloutsenhizer Would you like to review and approve your preferred PR, between #186 and #188? I'm ambivalent.

jdmgoogle commented 4 years ago

Weighing in with my $0.02...

I definitely prefer the style in PR #188, with the /part:XYZ included in the hierarchy for the non-UK-Parliament constituencies.

For the UK Parliament constituencies, though: which governmental body is in charge of drawing the boundaries and organizing the elections for those seats? In the US the congressional districts hang off the states because each state gets to draw the boundaries of those districts and (largely) controls how elections for those seats are managed. How is it done in the UK? E.g., is Wales told "you get 40 seats to the House of Commons, you decide the rest?" or is there some body at the UK level which draws the boundaries and oversees the elections?

chris48s commented 4 years ago

@jdmgoogle This lumps together a variety of questions, all of which have different answers:

which governmental body is in charge of drawing the boundaries

For parliamentary boundaries, the Boundary Commission for England, Boundary Commission for Wales, Boundary Commission for Scotland and Boundary Commission for Northern Ireland will all perform consultation, have input into the process and submit recommendations, but the process to turn those recommendations into (draft) legislation and propose it to Parliament is carried out by central Government and the process of debating and finally ratifying that legislation is carried out by the UK Parliament itself (basically the same as with any other piece of legislation passed through Parliament). It is not necessarily the case that the final bill passed will exactly correspond 100% with the original recommendations of the Boundary Commissions (although they might be involved in the process of amending the bill - we're stretching my knowledge at this point). Fundamentally though, the thing that defines the boundaries is a piece of legislation passed through the UK-wide Parliament.

and organizing the elections for those seats?

In terms of organising elections, it's a bit of a random mashup:

and oversees the elections?

At least this one is simple: All UK elections are regulated by The Electoral Commission: https://www.electoralcommission.org.uk/ which is a UK-wide body overseeing electoral administration and political finance.