GSS-Cogs / family-towns-and-high-streets


SG-Scottish-Index-of-Multiple-Deprivation-2020 #7

Open ajtucker opened 4 years ago

LPerryman commented 3 years ago

GitHub: https://github.com/GSS-Cogs/family-towns-and-high-streets/tree/master/datasets/SG-Scottish-Index-of-Multiple-Deprivation-2020
Jenkins: https://ci.floop.org.uk/job/GSS_data/job/towns-high-streets/job/SG-Scottish-Index-of-Multiple-Deprivation-2020/

josepajay commented 3 years ago

There was no scraper implemented for the Scottish Index of Multiple Deprivation 2020 dataset, and the data is spread over 2 different files. So I have used dataURL in info.json to fetch one file, downloaded the other file directly using the requests library, and loaded it using loadxlstabs.
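A minimal sketch of that two-file approach, for anyone repeating it. The helper names and the file names in the usage note are illustrative, not taken from the actual pipeline; the only things assumed from the comment above are that info.json has a top-level dataURL key and that the second file is fetched with requests.

```python
import json

import requests


def read_data_url(info_path="info.json"):
    """Return the dataURL value from info.json (assumed to be a top-level key)."""
    with open(info_path, encoding="utf-8") as f:
        return json.load(f)["dataURL"]


def download(url, dest):
    """Download url to the local path dest using requests and return dest."""
    response = requests.get(url, timeout=60)
    response.raise_for_status()  # stop early on a 4xx/5xx response
    with open(dest, "wb") as f:
        f.write(response.content)
    return dest
```

In use, the first file would come from read_data_url() and the second from a hard-coded URL passed to download(); the saved file path can then be handed to loadxlstabs.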

LPerryman commented 3 years ago

Sorry Grace, this was discussed in the dataviz meeting as possibly useful for Ahmed to use, so I have done the stage 2 spec!

LPerryman commented 3 years ago

Ranks are published on PMD4

LPerryman commented 3 years ago

I've managed to get 2 datasets with different Units from one pipeline onto PMD4 by cheating! Two datasets are output from the pipeline: Ranks and Indicators. The source is 2 spreadsheets from 2 different URLs. I process the first one, then within the script open info.json, change the dataURL and Unit values, save, and then process the second one.

The second dataset (Indicators) has multiple Measure Types, so I have cheated by adding an 'Indicator Type' column to hold the measure types. I thought I could get away with this as the dataset is still measuring 'Deprivation' through 'Indicators'.
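Roughly, the info.json switch between the two runs could look like the sketch below. The function name is made up, and the place where the unit lives (transform > columns > Value > unit) is an assumption about the info.json layout rather than something confirmed in this thread, so adjust the keys to match the real file.

```python
import json
from pathlib import Path


def retarget_info(info_path, data_url, unit):
    """Point info.json at a different source file and unit, then save it."""
    path = Path(info_path)
    info = json.loads(path.read_text(encoding="utf-8"))

    info["dataURL"] = data_url

    # ASSUMPTION: the unit is stored under transform.columns.Value.unit.
    value_column = (
        info.setdefault("transform", {})
        .setdefault("columns", {})
        .setdefault("Value", {})
    )
    value_column["unit"] = unit

    path.write_text(json.dumps(info, indent=4), encoding="utf-8")
    return info
```

The first dataset is processed with info.json as committed, then retarget_info is called with the second URL and unit before the second dataset is processed. Note that this leaves info.json modified on disk after the run.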

PMD4:

Scottish Index of Multiple Deprivation - Indicators (223232 rows): https://staging.gss-data.org.uk/cube/explore?uri=http%3A%2F%2Fgss-data.org.uk%2Fdata%2Fgss_data%2Ftowns-high-streets%2Fsg-scottish-index-of-multiple-deprivation-2020%2Findicators-catalog-entry

Scottish Index of Multiple Deprivation - Ranks (55808 rows): https://staging.gss-data.org.uk/cube/explore?uri=http%3A%2F%2Fgss-data.org.uk%2Fdata%2Fgss_data%2Ftowns-high-streets%2Fsg-scottish-index-of-multiple-deprivation-2020%2Franks-catalog-entry

Also, I've given up trying to run stage 1 transforms that use databaker on large amounts of data, as their run time on my Mac is stupid! Instead I just use the sheet_name parameter in as_pandas (thanks Alex):

tab = scraper.distributions[0].as_pandas(sheet_name='sheet name')

As the data is a simple table, I just create multiple tables, each pairing the main dimension (Data Zone) with one value dimension (Income, Employment, Education, etc.), and then concat them, creating a 'Deprivation Rank/Indicator' column based on the column heading names:

Table 1: Data Zone (column A), Deprivation Rank, Overall rank (column F)
Table 2: Data Zone (column A), Deprivation Rank, Income domain rank (column G)
Table 3: Data Zone (column A), Deprivation Rank, Employment domain rank (column H)

and so on and on and on. All this takes SECONDS to run.
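The table-per-column approach above can be sketched like this. It is not the actual pipeline script: the function name, the 'Value' column name, and the sample data (made-up Data Zone codes and ranks) are illustrative only, and the small DataFrame stands in for the one returned by as_pandas.

```python
import pandas as pd


def stack_domains(tab, key, value_columns, label):
    """Build one small table per value column and concatenate them.

    Each small table holds the key column, a label column containing the
    original column heading, and the values themselves.
    """
    parts = []
    for col in value_columns:
        part = (
            tab[[key, col]]
            .rename(columns={col: "Value"})
            .assign(**{label: col})  # heading name becomes the dimension value
        )
        parts.append(part[[key, label, "Value"]])
    return pd.concat(parts, ignore_index=True)


# Made-up sample standing in for the sheet loaded with as_pandas.
tab = pd.DataFrame(
    {
        "Data Zone": ["S01000001", "S01000002"],
        "Overall rank": [10, 20],
        "Income domain rank": [3, 4],
    }
)

ranks = stack_domains(
    tab, "Data Zone", ["Overall rank", "Income domain rank"], "Deprivation Rank"
)
```

The result has one row per Data Zone per rank column. pandas.melt would give the same long format in a single call if the explicit per-table loop is not needed.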

The datasets each have their own dataset URI:

http://gss-data.org.uk/data/gss_data/towns-high-streets/sg-scottish-index-of-multiple-deprivation-2020/ranks
http://gss-data.org.uk/data/gss_data/towns-high-streets/sg-scottish-index-of-multiple-deprivation-2020/indicators

but the codelist URIs are defined within the main URI, which is the one referenced within info.json:

http://gss-data.org.uk/data/gss_data/towns-high-streets/sg-scottish-index-of-multiple-deprivation-2020

so the dimension filter does not work in PMD4.

I have also added Attributes (Total Population, Working Age Population) and attempted to define them in the info.json but they are not showing in PMD4.

Tracey-B commented 3 years ago

@LPerryman BA Comments: Given the normal generic caveats these both look good to go.

RedWalters commented 3 years ago

Links to PMD4:
https://staging.gss-data.org.uk/catalog-entry/find-submodule?uri=http%3A%2F%2Fgss-data.org.uk%2Fdata%2Fgss_data%2Ftowns-high-streets%2Fsg-scottish-index-of-multiple-deprivation-2020%2Findicators-catalog-entry
https://staging.gss-data.org.uk/catalog-entry/find-submodule?uri=http%3A%2F%2Fgss-data.org.uk%2Fdata%2Fgss_data%2Ftowns-high-streets%2Fsg-scottish-index-of-multiple-deprivation-2020%2Franks-catalog-entry