open-sdg / sdg-build

Python package to convert SDG-related data and metadata between formats
MIT License

Dealing with blank cells - SDMX output #225

Closed LucyGwilliamAdmin closed 3 years ago

LucyGwilliamAdmin commented 3 years ago

@brockfanning If some rows in a column have a value, the blank cells in that same column aren't being converted to _T.

For example:

| Year | Sex | Value |
|------|-----|-------|
| 2016 |     | 20    |
| 2016 | M   | 25    |
| 2016 | F   | 22    |
| 2017 |     | 21    |
| 2017 | M   | 22    |
| 2017 | F   | 21    |

This leads to 3 SDMX series:

<Series FREQ="A" REPORTING_TYPE="N" SERIES="_T" AGE="_T" INCOME_WEALTH_QUANTILE="_T" EDUCATION_LEV="_T" OCCUPATION="_T" CUST_BREAKDOWN="_T" COMPOSITE_BREAKDOWN="_T" DISABILITY_STATUS="_T" ACTIVITY="_T" PRODUCT="_T" UNIT_MEASURE="PT">
  <Obs TIME_DETAIL="2016" OBS_VALUE="20" TIME_PERIOD="2016"/>
  <Obs TIME_DETAIL="2017" OBS_VALUE="21" TIME_PERIOD="2017"/>
</Series>
<Series FREQ="A" REPORTING_TYPE="N" SERIES="_T" SEX="M" AGE="_T" INCOME_WEALTH_QUANTILE="_T" EDUCATION_LEV="_T" OCCUPATION="_T" CUST_BREAKDOWN="_T" COMPOSITE_BREAKDOWN="_T" DISABILITY_STATUS="_T" ACTIVITY="_T" PRODUCT="_T" UNIT_MEASURE="PT">
  <Obs TIME_DETAIL="2016" OBS_VALUE="25" TIME_PERIOD="2016"/>
  <Obs TIME_DETAIL="2017" OBS_VALUE="22" TIME_PERIOD="2017"/>
</Series>
<Series FREQ="A" REPORTING_TYPE="N" SERIES="_T" SEX="F" AGE="_T" INCOME_WEALTH_QUANTILE="_T" EDUCATION_LEV="_T" OCCUPATION="_T" CUST_BREAKDOWN="_T" COMPOSITE_BREAKDOWN="_T" DISABILITY_STATUS="_T" ACTIVITY="_T" PRODUCT="_T" UNIT_MEASURE="PT">
  <Obs TIME_DETAIL="2016" OBS_VALUE="22" TIME_PERIOD="2016"/>
  <Obs TIME_DETAIL="2017" OBS_VALUE="21" TIME_PERIOD="2017"/>
</Series>

Notice that the SEX dimension is completely missing from the first series, instead of being set to `SEX="_T"`.

There's also a kazstat example here: https://kazstat.github.io/sdg-data-kazstat/sdmx/1-1-1.xml
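
For illustration, here is a rough sketch of the behaviour I'd expect, written with only the standard library. This is not sdg-build's actual code: the `fill_blank_dimensions` helper, the `TOTAL_CODE` constant, and the inline CSV are all made up for this example. It just shows blank cells in a disaggregation column being replaced with `_T` before the series are built, so the aggregate rows get an explicit total code.

```python
import csv
import io

# Hypothetical constant: the code used in the output above for "total".
TOTAL_CODE = "_T"


def fill_blank_dimensions(rows, dimension_columns, total_code=TOTAL_CODE):
    """Return copies of the rows with blank dimension cells set to total_code.

    Hypothetical helper for illustration only, not part of sdg-build.
    """
    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        for column in dimension_columns:
            value = new_row.get(column)
            if value is None or value.strip() == "":
                new_row[column] = total_code
        filled.append(new_row)
    return filled


# The example data from this issue, with the blank Sex cells made explicit.
CSV_TEXT = """Year,Sex,Value
2016,,20
2016,M,25
2016,F,22
2017,,21
2017,M,22
2017,F,21
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
filled_rows = fill_blank_dimensions(rows, dimension_columns=["Sex"])

# Every row now has an explicit Sex value, so three distinct series keys remain.
series_keys = sorted({row["Sex"] for row in filled_rows})
print(series_keys)  # ['F', 'M', '_T']
```

With the blanks filled this way, the first series would carry `SEX="_T"` like the other dimensions, rather than dropping the attribute.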

brockfanning commented 3 years ago

@LucyGwilliamAdmin This seems like it might be fixed by a PR I have open: #217

Could you give that a try?

LucyGwilliamAdmin commented 3 years ago

Closing this, as it was resolved by #217, which has now been merged.