Public-Health-Scotland / source-linkage-files

This repo is for the syntax used for the PHS Source Linkage File project
https://public-health-scotland.github.io/source-linkage-files/

Sort out the Postcode lookup / matching - variables with issues include `hbrescode` (HB2018), `datazone` and `hscp` #738

Closed by Jennit07 1 year ago

Jennit07 commented 1 year ago

I've sorted `hscp` on the branch `738-sort-pc-lookup`, and I think `hbrescode` was done in #724 for the individual file. I think `datazone` is an extra variable because it is taken from BOXI. We take `datazone2011` from the postcode lookup, so I think we can drop `datazone` altogether.

In the `fill_geographies` function we list `datazone` in `check_variables_exist`, but from what I can see we don't actually use it anywhere in the script. @Moohan, can you think of any reason why we would need `datazone` before I remove it? Thanks

Moohan commented 1 year ago

> In the `fill_geographies` function we list `datazone` in `check_variables_exist`, but from what I can see we don't actually use it anywhere in the script. @Moohan, can you think of any reason why we would need `datazone` before I remove it?

I think we used to take `datazone`, i.e. the datazone from the data mart that is attached to the episode, and if the postcode couldn't be matched to a new datazone through the lookup (because the postcode is invalid or missing), we'd use the original one. The only situation where this actually helps, though, is when the postcode is missing or invalid but `datazone` isn't, and I guess that's a very niche situation...

As long as we have `datazone2011` correctly named and no odd `datazone` variable left over, that should be fine!
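
For anyone reading later, the fallback described above can be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: `POSTCODE_LOOKUP`, `fill_datazone` and the datazone codes are made up for the example, and the real `fill_geographies` function may well handle this differently.

```python
from typing import Optional

# Hypothetical postcode lookup (postcode -> datazone2011).
# The postcodes and datazone codes are invented for illustration.
POSTCODE_LOOKUP = {
    "AB1 2CD": "S01000001",
    "EH1 1AA": "S01000002",
}


def fill_datazone(postcode: Optional[str],
                  original_datazone: Optional[str]) -> Optional[str]:
    """Prefer the datazone matched via the postcode lookup.

    If the postcode is missing or does not match the lookup, fall back
    to the datazone that came attached to the episode (assumed here to
    be what the data mart supplies).
    """
    matched = POSTCODE_LOOKUP.get(postcode) if postcode else None
    return matched if matched is not None else original_datazone


# Valid postcode: the lookup value wins over the original.
print(fill_datazone("AB1 2CD", "S01999999"))
# Missing postcode: the original datazone is kept.
print(fill_datazone(None, "S01999999"))
```

The fallback only changes the result in the niche case mentioned above, where the postcode is missing or invalid but the episode still carries a datazone.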

Jennit07 commented 1 year ago

As discussed on a call, I have renamed `datazone` in the datamart and made use of it in the `fill_geographies` function. PR open: #744