opencivicdata / ocd-division-ids

Open Civic Data Division IDs definition & canonical repository

UK - Constituency changes for next HoC general election #385

Closed sguenther85 closed 2 weeks ago

sguenther85 commented 1 month ago

Following the 2023 Periodic Review of Westminster constituencies, the UK Government introduced many changes for the next House of Commons general election through the Parliamentary Constituencies Order 2023.

Including:
- 211 newly named constituencies (lots of redistricting)
- Many abolished constituency names
- Some disappearing and newly created seats

See the Wikipedia summary, and the legislation (PDF) with a table that lists the new constituencies:
https://en.wikipedia.org/wiki/2023_Periodic_Review_of_Westminster_constituencies#New_and_abolished_constituencies
https://www.legislation.gov.uk/uksi/2023/1230/pdfs/uksi_20231230_en.pdf

More background: https://commonslibrary.parliament.uk/constituency-boundary-review-data-for-new-constituencies/

Readme will follow. Because the election date was announced unexpectedly, I wanted to share the constituency changes first.

sguenther85 commented 1 month ago

Readme added

jloutsenhizer commented 1 month ago

I checked the count of OCD IDs after filtering out aliased IDs and districts that are being abolished; the total is 543, which matches the expected count after the 2023 redistricting.

sguenther85 commented 1 month ago

@jloutsenhizer Thanks for checking. I have already updated the files. Please have a look at your requested changes.

jpmckinney commented 1 month ago

I think we typically set validThrough to be one day before the new validFrom, to avoid having both old and new districts considered valid on the same day.
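
A minimal sketch of that convention, assuming both columns hold ISO 8601 dates (the helper name `valid_through_for` is hypothetical and not part of the repository):

```python
from datetime import date, timedelta


def valid_through_for(new_valid_from: str) -> str:
    """Return the day before `new_valid_from` as an ISO date string.

    Hypothetical helper: the old division's validThrough is set to the day
    before the new division's validFrom, so the two never overlap.
    """
    start = date.fromisoformat(new_valid_from)
    return (start - timedelta(days=1)).isoformat()


print(valid_through_for("2024-07-04"))  # prints 2024-07-03
```

With a validFrom of 2024-07-04 for the new divisions, the abolished ones would carry a validThrough of 2024-07-03.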

sguenther85 commented 1 month ago

@jpmckinney done

jpmckinney commented 1 month ago

I see a few dates like "The boundaries of X will..." and "X will have its area reduced...". Can we change these to pure dates? We can add an extra column if it is important to retain the sentences (not sure what to name the column).

sguenther85 commented 1 month ago

What would it look like if we changed it to pure dates? Could you give an example? Maybe we should just leave the column empty in that case? I thought the official info would be helpful, but I didn't want to make it more complicated than it already is here in GB ;)

jpmckinney commented 1 month ago

You have some values like "2024-07-04: The boundaries of Berwick-upon-Tweed will change, and it will be renamed North Northumberland". I would just change them to "2024-07-04". We don't want the column empty if it is a new or abolished district.

sguenther85 commented 4 weeks ago

done

jpmckinney commented 4 weeks ago

Oh, my apologies, I misread the CSV and thought those values were in the validFrom/validThrough columns. They are fine in the sameAsNote column. You can reset to the previous commit and force push.

How are you deciding which are "sameAs" and which are "boundaries changed and division renamed"?

I think @chris48s @showerst @symroe had opinions on GB divisions when they were first added.

sguenther85 commented 4 weeks ago

Done. Reset to the previous version.

We decided this on the basis of the "Civics Common Standard Data Specification" here

jpmckinney commented 4 weeks ago

Hmm, reading that, I still don't know how you are deciding which to create aliases for, and which to make invalid / create new. Can you demonstrate with one example?

sguenther85 commented 3 weeks ago

You will always have a sameAs if both the boundaries AND the name are changed.

Here is an overview of all current and new OCD IDs from our colleague, with the official description of the changes: "Summary of how current constituencies will change, and their closest successors.xlsx"

jpmckinney commented 3 weeks ago

Thanks! For the "Summary of change" column, where is that from? I checked the links in issue description, but it was not obvious.

sguenther85 commented 3 weeks ago

Sure: https://commonslibrary.parliament.uk/boundary-review-2023-which-seats-will-change/ -> button: Full Screen Version -> Download all data (xlsx, 684KB)

And inside the .xlsx the 2nd tab

jpmckinney commented 3 weeks ago

Ok, so looks like the logic is:

Regexes used:

sguenther85 commented 3 weeks ago

@jpmckinney all done. We now have 157 validThrough, but 159 validFrom

jpmckinney commented 3 weeks ago

Is there any way we can track down the difference? We should have 650 valid divisions both before 2024-07-03 and after 2024-07-04 (not including sameAs).
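
One way to sketch that check, assuming a CSV with `sameAs`, `validFrom`, and `validThrough` columns as discussed in this thread. The `id` column name, the sample rows, and both function names are illustrative assumptions, not real data or repository code:

```python
import csv
import io
from datetime import date


def is_valid_on(row, day):
    """True if the row is not an alias and `day` falls inside its validity window."""
    if row.get("sameAs"):
        return False  # aliases are not counted as divisions
    valid_from = row.get("validFrom")
    valid_through = row.get("validThrough")
    if valid_from and day < date.fromisoformat(valid_from):
        return False  # not yet in force
    if valid_through and day > date.fromisoformat(valid_through):
        return False  # already abolished
    return True


def count_valid(rows, day):
    """Count the non-alias rows that are valid on `day`."""
    return sum(is_valid_on(row, day) for row in rows)


# Made-up sample: one unchanged seat, one abolished, one new, one alias.
SAMPLE = """id,name,sameAs,validThrough,validFrom
example:a,Unchanged Seat,,,
example:b,Abolished Seat,,2024-07-03,
example:c,New Seat,,,2024-07-04
example:d,Old Name,example:a,,
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(count_valid(rows, date(2024, 7, 3)))  # prints 2
print(count_valid(rows, date(2024, 7, 4)))  # prints 2
```

Run against the real file, both counts would be expected to equal 650. A mismatch on either date points at a missing validThrough or validFrom.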

Also, I intended only the punctuation in the "Ayr, Carrick and Cumnock" list to be updated. I only noted the others to assist me when comparing names before (which had more commas) and after (which had fewer).

jpmckinney commented 3 weeks ago

FWIW, this is why I write Python or Ruby scripts for Canada. If we can download the XLSX and then write code to update constituencies, then it's much easier to verify and make changes as needed. Right now, it's all quite hard to verify.

sguenther85 commented 2 weeks ago

@jpmckinney I made an update. I have now generated the new OCD file via a script, and we now have 157 validThrough and 157 validFrom entries, plus 54 sameAs.

And in the end we have 650 constituencies again. So everything looks fine now.

FYI: through the script I found the two entries where validFrom was set but there was also a sameAs reference for the same entry.
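
A check of this kind could look roughly like the sketch below. It is not the script used for this pull request. The `id` column, the sample rows, and the function names are assumptions for illustration:

```python
from collections import Counter


def column_counts(rows, columns=("validThrough", "validFrom", "sameAs")):
    """Count how many rows have a non-empty value in each column."""
    counts = Counter()
    for row in rows:
        for column in columns:
            if row.get(column):
                counts[column] += 1
    return counts


def alias_with_valid_from(rows):
    """Return the ids of rows that set both sameAs and validFrom."""
    return [row["id"] for row in rows if row.get("sameAs") and row.get("validFrom")]


# Made-up rows; the last one reproduces the kind of conflict described above.
rows = [
    {"id": "example:a", "sameAs": "", "validThrough": "2024-07-03", "validFrom": ""},
    {"id": "example:b", "sameAs": "", "validThrough": "", "validFrom": "2024-07-04"},
    {"id": "example:c", "sameAs": "example:b", "validThrough": "", "validFrom": "2024-07-04"},
]

counts = column_counts(rows)
print(counts["validThrough"], counts["validFrom"], counts["sameAs"])
print(alias_with_valid_from(rows))
```

In this sample the validFrom count exceeds the validThrough count by one, and the extra row is the alias that also carries a validFrom.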

sguenther85 commented 2 weeks ago

@jpmckinney friendly ping, as the election is not far away @jloutsenhizer @HKSenior

jpmckinney commented 2 weeks ago

Thank you! I typically commit the script as well under scripts/.

sguenther85 commented 2 weeks ago

@jpmckinney hehe, maybe next time. I wrote the script in JS with HTML output, and it is really dirty, with the sources as .csv files.

But can anybody merge the pull request now?