CampaignLab / data-pipeline

Scripts and schemas that aim to make data from the inventory easier to analyse

Overcrowding - 146 #20

Open hannah-o-rourke opened 5 years ago

hannah-o-rourke commented 5 years ago

Key Question: Do rates of overcrowding affect vote share in wards?

Theory: Are people more likely to vote Labour in areas with overcrowding? How strong is the correlation?

Key Data: You can download overcrowding figures by ward from the 2011 Census here. Just make sure the drop-down menu at the bottom left, under the Download heading, is set to wards 2011: https://www.nomisweb.co.uk/census/2011/QS412EW

The measures show the number of unoccupied or over-occupied rooms compared to the number of people living there: 0 means the property is fully occupied, -2 means it is over-occupied by 2, and +1 means it is under-occupied by 1. You can work out the over-occupancy rate of each ward by adding up the -1 and -2 columns and dividing the sum by the total.
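As a rough sketch of that calculation: the column names (`total`, `minus1`, `minus2`, and so on), the ward codes, and the counts below are all made up for illustration. The real Nomis export will use different labels, so the keys would need adjusting.

```python
import csv
import io

# Placeholder data: column names, ward codes and counts are hypothetical,
# not taken from the real QS412EW download.
SAMPLE_CSV = """ward_code,total,plus2,plus1,zero,minus1,minus2
WARD_A,1000,400,300,200,80,20
WARD_B,500,100,150,150,60,40
"""


def over_occupancy_rate(row):
    """Return (-1 column + -2 column) / total, or None if total is zero."""
    total = int(row["total"])
    if total == 0:
        return None
    return (int(row["minus1"]) + int(row["minus2"])) / total


# Map each ward code to its over-occupancy rate.
rates = {
    row["ward_code"]: over_occupancy_rate(row)
    for row in csv.DictReader(io.StringIO(SAMPLE_CSV))
}
```

Returning `None` for an empty ward avoids a division by zero and keeps those wards easy to filter out later.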

Task: Start to look for correlations between over-occupancy and the 2018 election results.
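One possible starting point is a Pearson correlation coefficient between each ward's over-occupancy rate and its Labour vote share. This is only a sketch: the function name `pearson_r` is my own, and the two lists hold placeholder numbers, not real census or election figures.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values, one entry per ward, in the same ward order.
over_occupancy = [0.02, 0.05, 0.08, 0.12, 0.15]
labour_share = [0.25, 0.30, 0.41, 0.38, 0.55]

r = pearson_r(over_occupancy, labour_share)
```

A value of `r` near +1 or -1 would indicate a strong linear relationship and a value near 0 a weak one. A correlation on its own would not show that overcrowding causes the difference in vote share.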

chris48s commented 5 years ago

One of the problems you're generally going to hit with this project when you're trying to match 2018 election results to ward-level data is that the ward boundaries/codes which were in use at the time of publication for your data sources won't necessarily exactly correspond to the electoral geographies in use in 2018.
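A quick way to see how big this problem is for a given source is to join on ward code and list whatever fails to match. In this sketch the function name `split_by_match`, the `WARD_A`/`WARD_B` codes, and every numeric value are placeholders; only `E05011462` is a real code, used here because it was created after 2011.

```python
def split_by_match(results_2018, census_2011):
    """Join two {ward_code: value} dicts on ward code.

    Returns (matched, unmatched): matched maps each shared code to a
    (result, census) pair; unmatched lists 2018 codes absent from the
    census data.
    """
    matched = {
        code: (results_2018[code], census_2011[code])
        for code in results_2018
        if code in census_2011
    }
    unmatched = sorted(set(results_2018) - set(census_2011))
    return matched, unmatched


# Placeholder vote shares and over-occupancy rates.
results_2018 = {"WARD_A": 0.41, "WARD_B": 0.35, "E05011462": 0.52}
census_2011 = {"WARD_A": 0.10, "WARD_B": 0.20}

matched, unmatched = split_by_match(results_2018, census_2011)
```

Here `unmatched` contains the wards whose 2018 code has no 2011 counterpart, which are the ones that need a decision before any analysis.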

With some of the sources you're interested in, if an area has had boundary changes since the publication of your data source, you're probably just out of luck.

With 2011 Census data you do have the advantage that the data is published at lower levels of geography. However, note that you can't necessarily express current ward boundaries exactly as a union of 2011 Census output areas. To work through an example, let's say that we're interested in Addiscombe East ward in Croydon ( E05011462 ):

[Image: map of the Addiscombe East ward boundary]

This ward was defined in The London Borough of Croydon (Electoral Changes) Order 2017 ( http://www.legislation.gov.uk/uksi/2017/1125/contents/made ) so E05011462 will not appear in 2011 Census data.

If we plot that boundary on top of 2011 Census Output Areas (the most granular level of statistical geography)

[Image: the ward boundary plotted over 2011 Census Output Areas]

we can see that there are a number of Output Areas which lie partly inside the new ward and partly outside it. I've highlighted a few for clarity:

[Image: the same map with a few partially overlapping Output Areas highlighted]

Obviously the MVP here (and in all cases) is just to ignore all of the local authorities which have had boundary changes since 2011. If anyone is thinking of aggregating newer boundaries using OAs as building blocks, it is worth being aware of this constraint: it won't be possible to get an exact match in all cases.
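A minimal sketch of that MVP, assuming you have a lookup from 2018 ward code to local authority. It treats a 2018 ward code that is missing from the 2011 data as a proxy for a boundary change, which is my assumption, not a guarantee. The function names, the authority labelled "Authority B", and all ward codes other than `E05011462` are hypothetical.

```python
def authorities_with_changes(ward_to_authority, census_ward_codes):
    """Authorities with at least one 2018 ward code absent from the 2011 data."""
    return {
        authority
        for ward, authority in ward_to_authority.items()
        if ward not in census_ward_codes
    }


def wards_in_unchanged_authorities(ward_to_authority, census_ward_codes):
    """2018 ward codes whose whole local authority matched the 2011 data."""
    changed = authorities_with_changes(ward_to_authority, census_ward_codes)
    return sorted(
        ward
        for ward, authority in ward_to_authority.items()
        if authority not in changed
    )


# Placeholder lookup: only E05011462 is a real code.
ward_to_authority = {
    "E05011462": "Croydon",
    "CROYDON_OTHER": "Croydon",
    "B_WARD_1": "Authority B",
    "B_WARD_2": "Authority B",
}
census_ward_codes = {"CROYDON_OTHER", "B_WARD_1", "B_WARD_2"}

usable = wards_in_unchanged_authorities(ward_to_authority, census_ward_codes)
```

Note that this drops `CROYDON_OTHER` too, even though its code matched: the whole authority is excluded once any of its wards has changed.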