open-sdg / sdg-build

Python package to convert SDG-related data and metadata between formats
MIT License

SDMX Config File #287

Closed by otis-bath 2 years ago

otis-bath commented 2 years ago

While making changes on a feature branch (feature-sdmx-test) so that the UK platform can output SDMX data, the changes suggested for the config file are generating an error with the series: https://github.com/ONSdigital/sdg-data/runs/4396797341?check_suite_focus=true

otis-bath commented 2 years ago

Hi @brockfanning, just checking up on this one. I'm not sure how demanding the required changes will be, but we're looking at getting this into 1.7.0 if possible. Do you think that is realistic? We want to finalise the release items for 1.7.0 next week. Thanks

brockfanning commented 2 years ago

@otis-bath The reason this is happening is that indicator 12.4.2 has both a "Units" column and a "Unit measure" column. sdg-build automatically converts "Units" to "UNIT_MEASURE" for the SDMX output, and your custom mappings also convert "Unit measure" to "UNIT_MEASURE". The end result is two columns with the same name, which causes that unhelpful error.

I think the fix is to automatically drop any existing columns that your custom mappings would duplicate. So, in the example described above, the "Units" column would be dropped (for the SDMX output). I'll work on a PR for that.
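To make the idea concrete, here is a minimal sketch of that dropping logic in plain Python. It is not the sdg-build implementation: the function name `plan_columns` and the two mapping dictionaries are hypothetical, and the table is reduced to a list of column names so the collision is easy to see.

```python
def plan_columns(columns, default_renames, custom_renames):
    """Return (kept_columns, output_names) without duplicate output names.

    Hypothetical helper for illustration only, not part of sdg-build.
    """
    # Output names that the custom mappings will actually produce here.
    custom_targets = {
        target for source, target in custom_renames.items() if source in columns
    }
    kept, names = [], []
    for column in columns:
        if column in custom_renames:
            # Custom mappings take priority.
            kept.append(column)
            names.append(custom_renames[column])
        elif default_renames.get(column) in custom_targets:
            # The automatic rename would duplicate a custom one: drop the column.
            continue
        else:
            kept.append(column)
            names.append(default_renames.get(column, column))
    return kept, names


# Assumed stand-ins for the automatic and custom column mappings.
default_renames = {"Units": "UNIT_MEASURE"}
custom_renames = {"Unit measure": "UNIT_MEASURE"}
columns = ["Year", "Units", "Unit measure", "Value"]

# Applying both mappings naively yields "UNIT_MEASURE" twice.
naive = [custom_renames.get(c, default_renames.get(c, c)) for c in columns]

# Dropping the conflicting column first leaves one "UNIT_MEASURE".
kept, names = plan_columns(columns, default_renames, custom_renames)
```

Here "Units" is dropped only because "Unit measure" is present and mapped to the same name. If the custom source column were missing from an indicator, the automatic rename would still apply.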

EDIT: After fixing that, another issue presented itself, which occurs when there is a code mapping for a column that contains "NaN" (empty values). I've attempted fixes for both issues in the same PR linked below.
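For the "NaN" case, one possible approach is to skip empty cells when applying a code mapping. The sketch below is only an assumption about how such a guard could look, not the change made in the PR. The helper names `is_missing` and `map_codes` and the example codes are hypothetical.

```python
import math


def is_missing(value):
    """Treat None and float NaN as empty cells."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def map_codes(values, code_map):
    """Apply a code mapping to a column, leaving empty cells untouched."""
    mapped = []
    for value in values:
        if is_missing(value):
            # Do not try to look up an empty cell in the mapping.
            mapped.append(value)
        else:
            # Unmapped values pass through unchanged.
            mapped.append(code_map.get(value, value))
    return mapped


# Assumed example mapping and column values.
code_map = {"Tonnes": "T", "Kilograms": "KG"}
values = ["Tonnes", float("nan"), "Kilograms"]
result = map_codes(values, code_map)
```

The empty cell stays empty in `result`, while the other two values are replaced by their codes.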

brockfanning commented 2 years ago

@otis-bath #288 is ready for testing.