open-sdg / sdg-build

Python package to convert SDG-related data and metadata between formats
MIT License

SDMX metadata output #261

Closed brockfanning closed 3 years ago

brockfanning commented 3 years ago

Fixes #232

This adds metadata to the existing SDMX output class. This is a quick-fix solution using a low-tech templating approach. Longer term it would be nice to get this functionality into the Python SDMX library, but this works for now. The metadata is placed in a 'meta' subfolder of the 'sdmx' folder.

New configuration

This adds some new parameters to the SdmxMlOutput class:

LucyGwilliamAdmin commented 3 years ago

@brockfanning is there a way to constrain the output to a specific schema, e.g. the MSD, and exclude prose fields?

brockfanning commented 3 years ago

@LucyGwilliamAdmin Right now the constrain_meta parameter will limit to whatever is being used as the metadata schema. I guess we may need more control than that. For example, if the metadata schema includes "my_other_field", we may want a way to keep that field out of the SDMX output, right? I'm just wondering what would be a good parameter for that.

brockfanning commented 3 years ago

@LucyGwilliamAdmin I've added an msd parameter. If you pass in a path to an MSD, and also have constrain_meta set to true, then the metadata should be limited to whatever is in the MSD. If you have constrain_meta set to true, but you do not specify an MSD, it will assume the global MSD.
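The fallback described above can be sketched in plain Python. This is only an illustration of the intended behaviour, not the code in this branch: the function name `constrain_metadata`, the `GLOBAL_MSD_CONCEPTS` set, and the concept IDs in it are hypothetical placeholders, and a real MSD would be parsed from XML rather than passed in as a list.

```python
# Hypothetical stand-in for the concept IDs defined in the global MSD.
GLOBAL_MSD_CONCEPTS = {"SDG_INDICATOR_INFO", "STAT_CONC_DEF"}


def constrain_metadata(metadata, constrain_meta=True, msd_concepts=None):
    """Return the metadata fields that should reach the SDMX output."""
    if not constrain_meta:
        # No constraint requested: pass every field through unchanged.
        return dict(metadata)
    # With no MSD supplied, fall back to the (assumed) global MSD concepts.
    allowed = GLOBAL_MSD_CONCEPTS if msd_concepts is None else set(msd_concepts)
    return {key: value for key, value in metadata.items() if key in allowed}


indicator_meta = {
    "STAT_CONC_DEF": "Definition text",
    "my_other_field": "Not part of any MSD",
}
constrained = constrain_metadata(indicator_meta)
unconstrained = constrain_metadata(indicator_meta, constrain_meta=False)
```

Under these assumptions, `constrained` keeps only `STAT_CONC_DEF`, while `unconstrained` still contains `my_other_field`.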

brockfanning commented 3 years ago

@LucyGwilliamAdmin This one also might be a good addition to 1.4.0 - so maybe we could merge and test during the beta testing?

LucyGwilliamAdmin commented 3 years ago

@brockfanning I'm just in the middle of giving this one a test - shouldn't be too long on it

LucyGwilliamAdmin commented 3 years ago

@brockfanning I'm getting a lot of warnings and then build fails: https://github.com/LucyGwilliamAdmin/nepal-data/runs/2804655196?check_suite_focus=true

brockfanning commented 3 years ago

Ok cool, I'll check it out now.

brockfanning commented 3 years ago

@LucyGwilliamAdmin Those issues were more about the Word input, but I think they are important issues so I threw fixes into this branch. Can you give it another try?

LucyGwilliamAdmin commented 3 years ago

@brockfanning It's not failing anymore, but I'm not getting an SDMX folder in the data build: https://lucygwilliamadmin.github.io/nepal-data/

brockfanning commented 3 years ago

@LucyGwilliamAdmin I think you may need to specify the DSD, like:

sdmx_output:
  dsd: CBS_DSD.xml
  constrain_meta: true

brockfanning commented 3 years ago

@LucyGwilliamAdmin Actually that works, but the metadata doesn't seem to be outputting right. Let me investigate a bit.

brockfanning commented 3 years ago

@LucyGwilliamAdmin I see the issue. It can't produce the metadata in SDMX unless it knows the ref area and reporting type. So try this config:

sdmx_output:
  dsd: CBS_DSD.xml
  constrain_meta: true
  meta_reporting_type: N
  meta_ref_area: NP

LucyGwilliamAdmin commented 3 years ago

@brockfanning I'm getting metadata xml output now - is there a way to validate?

brockfanning commented 3 years ago

@LucyGwilliamAdmin Not that I know of - given that the MSD has not been officially released yet I think this is pretty cutting-edge.

LucyGwilliamAdmin commented 3 years ago

Ok @brockfanning, since it's outputting are you happy for it to be merged at this point?

brockfanning commented 3 years ago

@LucyGwilliamAdmin Yep, I think it's good enough to merge at this point.