Leeds-MRG / Minos

SIPHER Microsimulation for estimating the effect of income policy on mental health.
MIT License

Add regional information to the data #7

Closed ld-archer closed 2 years ago

ld-archer commented 2 years ago

Need to add spatial information in the form of the Government Office Region (GOR) to the data, using this variable from Understanding Society.

Steps:

RobertClay commented 2 years ago

Update RateTables in Daedalus to take regions rather than LADs. Can repurpose the nolocation scripts.

RobertClay commented 2 years ago

The region variable (gor_dv) is split into two variables for some reason (gor_dv_x, gor_dv_y). This isn't in any of the documentation, so who knows what this means.

RobertClay commented 2 years ago

Also, they seem to be exactly equal to each other.

ld-archer commented 2 years ago

I think there's a GOR variable in both indresp and hhresp, so that's probably the issue. Can safely ignore the hhresp one, I reckon.

RobertClay commented 2 years ago

Yeah, those suffixes indicate it's due to a pandas merge. Worth looking into how to deal with clashes.
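For reference, a minimal sketch of how the suffixes arise. The frames below are toy stand-ins, and the key names `pidp` and `hidp` and all values are assumptions for illustration, not the real data:

```python
import pandas as pd

# Toy stand-ins for the indresp and hhresp tables (values are made up).
indresp = pd.DataFrame(
    {"pidp": [1, 2, 3], "hidp": [10, 10, 20], "gor_dv": [1, 1, 7]}
)
hhresp = pd.DataFrame({"hidp": [10, 20], "gor_dv": [1, 7]})

# gor_dv is present in both frames but is not a merge key, so pandas keeps
# both copies and appends its default suffixes: "_x" (left), "_y" (right).
merged = indresp.merge(hhresp, on="hidp", how="left")
print(list(merged.columns))

# Check whether the two copies agree on every row.
same = bool((merged["gor_dv_x"] == merged["gor_dv_y"]).all())
print(same)
```

If the two copies agree everywhere, as they appear to here, either one can be dropped without losing information.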

ld-archer commented 2 years ago

We can just remove that var as part of the combine function (and any others that clash in future). I'm happy to do it, or tell me if you are, so we don't muck it up.

RobertClay commented 2 years ago

https://stackoverflow.com/questions/62779180/pandas-merge-unexpectedly-produces-suffixes

RobertClay commented 2 years ago

I've added this in to prefer the indresp variables.
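A rough sketch of one way to prefer the indresp copy, not necessarily what the repo's combine function does: drop any clashing non-key columns from the right-hand frame before merging. The helper name `merge_prefer_left`, the key `hidp`, and the column `hhsize` are hypothetical here:

```python
import pandas as pd


def merge_prefer_left(left, right, on):
    """Left-merge two frames, keeping the left copy of any clashing column."""
    keys = [on] if isinstance(on, str) else list(on)
    # Columns present in both frames that are not merge keys would clash.
    clashes = [c for c in right.columns if c in left.columns and c not in keys]
    return left.merge(right.drop(columns=clashes), on=keys, how="left")


# Toy stand-ins for the two tables (values are made up).
indresp = pd.DataFrame(
    {"pidp": [1, 2, 3], "hidp": [10, 10, 20], "gor_dv": [1, 1, 7]}
)
hhresp = pd.DataFrame({"hidp": [10, 20], "gor_dv": [1, 7], "hhsize": [2, 1]})

combined = merge_prefer_left(indresp, hhresp, on="hidp")
print(list(combined.columns))
```

Dropping the clashes before the merge means no `_x`/`_y` columns are created, so any future clash is handled the same way without extra cleanup.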

ld-archer commented 2 years ago

Looks great!

RobertClay commented 2 years ago

I hate LAD codes. Between 2019 and the Daedalus rate tables, several regions differ. Some of this may just be weird naming errors. I'll need to update the JSONs manually, I think.

{'Stevenage', 'St Albans', 'Welwyn Hatfield', 'City of London+Westminster', 'Northumberland UA', 'Cornwall+Isles of Scilly', 'Gateshead', 'East Hertfordshire'}

{'E07000104', 'E07000101', 'E07000097', 'E06000048', 'E09000001+E09000033', 'E06000052+E06000053', 'E07000100', 'E08000020'}
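Mismatches like these can be found with plain set differences. A small sketch with toy name sets; the variable names and which source holds which names are assumptions:

```python
# Toy name sets standing in for the two sources being compared.
names_2019 = {"Leeds", "Gateshead", "Stevenage"}
names_rate_tables = {"Leeds", "Cornwall+Isles of Scilly"}

# Names that appear in one source but not the other.
only_2019 = names_2019 - names_rate_tables
only_rate_tables = names_rate_tables - names_2019

print(sorted(only_2019))
print(sorted(only_rate_tables))
```

Python sets are unordered, so the printed names and codes in sets like the ones above do not line up pairwise.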

Regional data for BHPS was easy though, so that's nice.

RobertClay commented 2 years ago

https://github.com/tomalrussell/uklad-changes

some hero made this though

RobertClay commented 2 years ago

Also:

{'Eilean Siar', 'Dumfries & Galloway', 'Derry City and Strabane', 'Perth & Kinross', 'Armagh City, Banbridge and Craigavon', 'Argyll & Bute', 'Edinburgh, City of', 'Ards and North Down'}

{'S12000036', 'S12000035', 'N09000011', 'S12000024', 'S12000013', 'S12000006', 'N09000002', 'N09000005'}

I've added all these to the downloaded ONS dictionaries. It's sloppy, but I can't think of a better way without redoing the Daedalus rate tables from scratch.

RobertClay commented 2 years ago

Done. Needs reviewing as there's a good chunk of updates.

ld-archer commented 2 years ago

Closed in PR #13