ONSdigital / SDG_11.2.1

Analysis for the UN Sustainable Development Goal 11.2.1
https://onsdigital.github.io/SDG_11.2.1/
Apache License 2.0

Put ages into buckets #7

Closed james-westwood closed 3 years ago

james-westwood commented 4 years ago

Check whether a function already exists to bucketize values. If not, write a function to bucketize the age values. Looking at other SDG indicators, the buckets should be:

From Indicator 3.6.1

And from Indicator 3.4.2

These do not agree with each other. Need to ask.

jwestw commented 4 years ago

@jwestw to speak to SDG data team about which age brackets to use.

james-westwood commented 4 years ago

Decided to use 5 year age groups.

Explore other groupings to see if anything stands out as interesting.
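A minimal sketch of what 5 year age groups could look like with `pandas.cut`, for anyone exploring other groupings. The function name `bucketize_ages`, the `width` parameter and the open-ended top group starting at 90 are assumptions for illustration, not something decided in this issue.

```python
import pandas as pd


def bucketize_ages(ages, width=5, max_age=90):
    """Label each age with a fixed-width band, e.g. "15-19".

    Hypothetical helper: ages of `max_age` and over share one open-ended
    band ("90+" by default). `max_age` should be a multiple of `width`.
    """
    # Left edges 0, 5, ..., max_age, plus infinity to close the top band.
    edges = list(range(0, max_age + 1, width)) + [float("inf")]
    labels = [f"{lo}-{lo + width - 1}" for lo in edges[:-2]] + [f"{max_age}+"]
    # right=False makes each band include its lower edge: [0, 5), [5, 10), ...
    return pd.cut(ages, bins=edges, labels=labels, right=False)


ages = pd.Series([0, 4, 5, 19, 89, 90, 104])
print(bucketize_ages(ages).tolist())
```

Changing `width` to 10 is a quick way to try a different grouping, provided `max_age` stays a multiple of it.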

james-westwood commented 3 years ago

I found some strange data after isolating the counts of people (from census data) by age. It showed up because, out of many thousands of values, it is the only one with a comma in it, so it wouldn't convert to an integer.

The problem is in column 19 (for 19 years old), row 3091. Value is 1021.

image

The value is much higher than those around it and other typical values I've seen.

So I have forced it to convert, and have plotted the counts of the integer values.

image

And this is the plot, hovering over the problem value:

image

This outlier looks like it might be erroneous.
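For reference, a rough sketch of how a value like this can be located and then forced to convert. The toy frame below uses invented values and column names, not the census data, and it assumes the counts were read in as strings.

```python
import pandas as pd

# Invented toy data: counts stored as strings, one with a thousands separator.
counts = pd.DataFrame(
    {"18": ["12", "9", "15"], "19": ["10", "1,021", "14"]}
)

# Coerce to numbers; anything that cannot be parsed becomes NaN.
coerced = counts.apply(pd.to_numeric, errors="coerce")

# Cells that held a value but failed to parse are the problem cells.
bad_cells = coerced.isna() & counts.notna()
flags = bad_cells.stack()
bad_locations = flags[flags].index.tolist()
print(bad_locations)  # list of (row label, column label) pairs

# Force the conversion by stripping the commas before parsing.
cleaned = counts.apply(
    lambda col: pd.to_numeric(col.str.replace(",", "", regex=False))
)
print(cleaned.loc[1, "19"])
```

If the data come straight from a CSV, `pandas.read_csv` also accepts a `thousands=","` argument, which would parse such values on load. Neither approach says anything about whether the value itself is correct.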

james-westwood commented 3 years ago

I tested the method that I created and am quite happy with its efficiency.

image

And dropping the original columns gets the dataframe I want.

image
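The screenshots are not reproduced here, so the following is only a sketch of one way to do this, not necessarily the method tested above. It assumes a wide dataframe with one column per single year of age, named "0", "1", and so on, with the last column acting as an open-ended top group. The function name `bucket_age_columns` and its parameters are hypothetical.

```python
import pandas as pd


def bucket_age_columns(df, width=5, max_age=90):
    """Sum single-year age columns into bands, then drop the originals.

    Assumes columns "0" ... str(max_age) exist and that the last one is an
    open-ended "max_age and over" count. Other columns are left untouched.
    """
    out = df.copy()
    age_cols = [str(age) for age in range(max_age + 1)]
    for lo in range(0, max_age, width):
        # Columns belonging to this band, e.g. "15", "16", ..., "19".
        band = [str(age) for age in range(lo, min(lo + width, max_age))]
        out[f"{lo}-{lo + len(band) - 1}"] = out[band].sum(axis=1)
    out[f"{max_age}+"] = out[str(max_age)]
    # Dropping the single-year columns leaves only the banded counts.
    return out.drop(columns=age_cols)


# Invented toy data: two areas, ages 0 to 10, top group "10+".
toy = pd.DataFrame({"area": ["A", "B"], **{str(a): [1, 2] for a in range(11)}})
banded = bucket_age_columns(toy, width=5, max_age=10)
print(banded)
```

Summing whole column slices with `sum(axis=1)` keeps the work vectorised, which is usually what makes this kind of bucketing fast on census-sized tables.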