ONSdigital / sdg-SDMX-data-qualifier

MIT License

map other_info col to True/False #3

Closed jwestw closed 3 years ago

jwestw commented 3 years ago

Time estimate: 1 hour.

(This could be done in 15 minutes, but it depends on how uniform the values in the column are.)

jwestw commented 3 years ago

There are 97 different values in the other_info column
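A count like this can be reproduced with pandas. The following is only a minimal sketch: it assumes the metadata sits in a DataFrame `df` indexed by indicator ID, and the sample rows and IDs are made up for illustration rather than taken from the real data.

```python
import pandas as pd

# Hypothetical sample rows; the real frame holds the full indicator metadata
df = pd.DataFrame(
    {
        "other_info": [
            "Data follows the UN specification for this indicator.",
            "This indicator is being used as an approximation of the UN SDG Indicator.",
            "Data follows the UN specification for this indicator.",
            None,
        ]
    },
    index=["1-1-1", "1-2-1", "1-3-1", "1-4-1"],
)

# Number of distinct non-null values in the column
n_unique = df["other_info"].nunique()

# How often each distinct value occurs, most frequent first
counts = df["other_info"].value_counts()
print(n_unique)
print(counts)
```

Looking at `value_counts()` first gives a quick sense of how uniform the column is before writing any mapping rules.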

jwestw commented 3 years ago

Hey Lucy (it won't let me tag you on this new repo)

I have a couple of problematic entries:

df.loc['8-1-1'].other_info
'<p>Data going back to 1956 are available from the UK Economic Accounts (see Sources tab).<p> Data follows the UN specification for this indicator, with the exception that  values have not been converted to US dollars. This indicator is being used as an approximation of the UN SDG Indicator. Where possible, we will work to identify or develop UK data to meet the global indicator specification. This indicator has been identified in collaboration with topic experts.'
df.loc['6-2-1'].other_info
'A framework for measuring faecal waste flows and safety factors has been developed and piloted in 12 countries (World Bank Water and Sanitation Program, 2014), and is being adopted and scaled up within the sanitation sector. This framework has served as the basis for indicators 6.2.1 and 6.3.1. Data on safe disposal and treatment are not available for all countries. However, sufficient data were available to make global and regional estimates of safely managed sanitation services in 2017. Presence of a handwashing station with soap and water does not guarantee that household members consistently wash hands at key times, but has been accepted as the most suitable proxy. Data were available for 70 countries in 2017. At present, UK data does not account for homeless rough sleepers. Data follows the UN specification for this indicator. This indicator has been identified in collaboration with topic experts.'

These seem to contain contradictory statements, i.e. "has been accepted as the most suitable proxy" and "Data follows the UN specification for this indicator".

There only seem to be 2 like this, though.

Are they official or proxy?
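One way to surface entries like these is to flag any text that matches both an "official" phrase and a "proxy" phrase. This is a rough sketch, not the check used in the repo: the two patterns are assumptions based on the wording quoted above, `flag_contradictions` is a hypothetical helper name, and the sample texts are shortened excerpts.

```python
import pandas as pd

# Assumed marker phrases, based on the wording of the entries quoted above
OFFICIAL_PATTERN = r"data follows the un specification"
PROXY_PATTERN = r"proxy|approximation"


def flag_contradictions(other_info: pd.Series) -> pd.Series:
    """Return a boolean Series: True where a text matches both patterns."""
    text = other_info.fillna("")
    says_official = text.str.contains(OFFICIAL_PATTERN, case=False)
    says_proxy = text.str.contains(PROXY_PATTERN, case=False)
    return says_official & says_proxy


# Shortened excerpts of the two entries, plus one hypothetical clean entry
sample = pd.Series(
    {
        "8-1-1": "Data follows the UN specification for this indicator, with "
                 "the exception that values have not been converted to US "
                 "dollars. This indicator is being used as an approximation "
                 "of the UN SDG Indicator.",
        "6-2-1": "Presence of a handwashing station has been accepted as the "
                 "most suitable proxy. Data follows the UN specification for "
                 "this indicator.",
        "0-0-0": "Data follows the UN specification for this indicator.",
    }
)

flagged = flag_contradictions(sample)
contradictory_ids = flagged[flagged].index.tolist()
print(contradictory_ids)
```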

LucyGwilliamAdmin commented 3 years ago

Hi @jwestw

I think 6.2.1 is definitely a proxy for our purposes, so it couldn't be reported to SDG Lab.

8.1.1 - as long as the only departure from the UN specification is that the currency hasn't been converted, we should be able to take that forward. As a side note, if that is the case, I think the text should be updated to make it clear. Could you add that to a list of things to raise with the data team, please? Or even just tag one of them here?

jwestw commented 3 years ago

This took much longer than the hour I estimated, mainly because of the quality checks. It was more like 3 hours.

The ticket is complete though. https://github.com/ONSdigital/sdg-SDMX-data-qualifier/commit/564ccedd1ff01edc9d8b68dc1e34e090a265f2c4

The commit also includes a quality check. We still need to decide what to do with the contradictory entries: we can either include or exclude them.
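To make the include/exclude choice concrete, here is a sketch of a mapping function with that decision exposed as a parameter. It is not the code in the commit above. The function name `classify_other_info`, the marker phrases, and the convention that `True` means "follows the UN specification" and `False` means "proxy" are all assumptions for illustration.

```python
import pandas as pd


def classify_other_info(text, contradiction_value=None):
    """Map one other_info text to True (official), False (proxy) or None.

    contradiction_value is returned when a text matches both kinds of
    phrase: None leaves such entries out, True or False forces a decision.
    """
    if not isinstance(text, str):
        return None  # missing value, nothing to classify
    lowered = text.lower()
    official = "data follows the un specification" in lowered
    proxy = "proxy" in lowered or "approximation" in lowered
    if official and proxy:
        return contradiction_value
    if official:
        return True
    if proxy:
        return False
    return None  # no known phrase found; needs a manual look


# Hypothetical texts covering each branch
texts = pd.Series(
    [
        "Data follows the UN specification for this indicator.",
        "This indicator is being used as an approximation of the UN SDG Indicator.",
        "Accepted as the most suitable proxy. Data follows the UN specification.",
    ]
)

excluded = [classify_other_info(t) for t in texts]
as_proxy = [classify_other_info(t, contradiction_value=False) for t in texts]
print(excluded)
print(as_proxy)
```

Keeping contradictory entries as `None` by default means they drop out of any True/False filter until someone decides how to treat them.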