alan-turing-institute / QUIPP-collab

Collaboration on the QUIPP project

UK census data - teaching file #139

Open gmingas opened 4 years ago

gmingas commented 4 years ago

This is a small non-disclosive sample of the census, made publicly available by ONS here and designed for teaching purposes. It contains 18 census characteristics (sex, age, region, ethnic group, religion, etc.) for 1% of the census population (~570,000 individuals). Personal identifiers (name, address, date of birth) have been removed. Potentially disclosive variables (e.g. geographic information) have been either removed completely or aggregated. The data come under the Open Government Licence (OGL), which requires source accreditation when the data are reproduced: link. This is now part of the QUIPP pipeline here. Note that a 5% sample of the census data is also available from the UK Data Service - see #57.

For microsimulation synthesis we can combine this with aggregated UK census data made publicly available by the UK Data Service here. We are using a particular dataset containing the numbers of males and females per region in England (only 9 rows). We might need other versions of this dataset that include different variables (e.g. religion, age) in the same aggregated format. These come under an OGL and also a EULA from the UK Data Service, which does not permit attempts to identify individuals, households or organisations: link. This is also now part of the QUIPP pipeline here.
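
As a rough illustration of how the two sources could be combined, here is a minimal sketch that reweights individual-level records so that their weighted totals match aggregated counts of males and females per region. This is not the QUIPP pipeline's actual method. The records, region names, counts and the `cell_weights` helper are all hypothetical, and the real files will have different column names and codes. With a single region-by-sex constraint the weight is simply the target count divided by the sample count in each cell; adding further variables (e.g. religion, age) would need an iterative method such as iterative proportional fitting.

```python
from collections import Counter

# Hypothetical individual-level records as (region, sex) pairs.
# The real teaching file uses its own column names and codes.
microdata = [
    ("North East", "male"),
    ("North East", "female"),
    ("North East", "female"),
    ("London", "male"),
    ("London", "male"),
    ("London", "female"),
]

# Hypothetical aggregated counts per (region, sex) cell.
# These numbers are made up for illustration only.
aggregate_counts = {
    ("North East", "male"): 100,
    ("North East", "female"): 120,
    ("London", "male"): 300,
    ("London", "female"): 310,
}


def cell_weights(records, targets):
    """Return one weight per cell so weighted sample counts equal the targets."""
    sample_counts = Counter(records)
    weights = {}
    for cell, target in targets.items():
        n = sample_counts.get(cell, 0)
        if n == 0:
            # A cell with no sampled individuals cannot be reweighted.
            raise ValueError(f"no sample records for cell {cell!r}")
        weights[cell] = target / n
    return weights


weights = cell_weights(microdata, aggregate_counts)

# Check: summing the weights of the records in each cell recovers the target.
weighted_totals = Counter()
for record in microdata:
    weighted_totals[record] += weights[record]

for cell in sorted(aggregate_counts):
    print(cell, weights[cell], weighted_totals[cell])
```

Each individual then stands in for `weights[cell]` people in the synthetic population. Sampling individuals in proportion to these weights is one way to turn the weighted sample into a full-size synthetic dataset.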