Closed brockfanning closed 5 years ago
@LucyGwilliamAdmin You asked: what about indicators that don't have data? Similarly to having CSV as the input, would we need SDMX-ML placeholders?
That's a good question and I'm not sure what the best way is. Placeholders may be a good place to start. I guess these would be metadata-only indicators? I'm actually not clear on how metadata should be imported with SDMX yet. Some things I'm wondering:
You can see here that the open-sdg output currently only provides minimum required metadata. But I imagine that if the goal is to manage everything in SDMX, we will need to figure out how to import metadata from SDMX too.
@brockfanning
So at the moment, there is no metadata in the Metadata tab panel? If so, is there any way we could have data coming from SDMX and metadata coming from .md?
Yes, without any non-SDMX sources of metadata, the metadata tabs are pretty bare right now - just the indicator id and the target id. We may be able to pull more metadata from the existing SDMX - that could be something to look into.
And yes, in theory it should be possible to combine SDMX data with YAML metadata. This object-oriented approach can send any number of "inputs" to a single "output". So there could be any number of SDMX inputs (like in this example) as well as any number of the YAML inputs (like in this example). All the inputs should be in a single list (like here), and then passed to the output (like here).
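As a rough illustration of the "many inputs, one output" idea described above, here is a minimal self-contained sketch. The class names `DictInput` and `CombinedOutput` are hypothetical stand-ins and are not the sdg-build API; the indicator content is made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DictInput:
    """Hypothetical stand-in for one input (SDMX data, YAML metadata, etc.)."""
    name: str
    indicators: dict  # indicator id -> {"data": [...]} and/or {"meta": {...}}


@dataclass
class CombinedOutput:
    """Hypothetical output that merges every input it is given."""
    inputs: list
    merged: dict = field(default_factory=dict)

    def execute(self):
        for source in self.inputs:
            for indicator_id, parts in source.indicators.items():
                target = self.merged.setdefault(
                    indicator_id, {"data": [], "meta": {}}
                )
                # Data rows are appended; metadata keys are merged.
                target["data"].extend(parts.get("data", []))
                target["meta"].update(parts.get("meta", {}))
        return self.merged


# Made-up example content: one data source and one metadata source.
sdmx_input = DictInput("sdmx", {"1-1-1": {"data": [{"Year": 2018, "Value": 1.5}]}})
yaml_input = DictInput("yaml", {"1-1-1": {"meta": {"indicator_name": "Example name"}}})

# All inputs go in a single list, which is then passed to the output.
output = CombinedOutput(inputs=[sdmx_input, yaml_input])
result = output.execute()
print(result["1-1-1"])
```

The point of the sketch is only that the output does not care which kind of input supplied the data or the metadata for an indicator.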
I'm trying to add metadata from md files in the meta folder, so I have made changes to this file (lines 64-69), but the metadata doesn't seem to be coming through to the feature branch (the Travis checks passed, though).
Never mind, a couple of things were missing - metadata is now showing on test branch but I am getting an error in the console: Something to do with data not being in CSV, maybe?
Guessing it's something to do with metadata and data interacting, as data isn't showing now that metadata is showing?
@brockfanning any idea?
I now have SDMX-ML files for all indicators (reported and placeholders) as well as a metadata file for each indicator on this branch, but now neither data nor metadata is showing on the feature branch. Still getting the error mentioned above. Also getting lots of validation errors in the Travis build.
Actually metadata is showing
Let me give this a try locally and see if anything jumps out at me. More soon.
@LucyGwilliamAdmin Locally I also got some metadata validation errors. Before diving into the SDMX stuff, let's resolve those errors.
First I saw a whole bunch of these:
Validation errors for indicator [some indicator id]
None is not of type 'string', 'integer'
It would be great if the errors displayed the field name (possible future improvement?) but I figured out that these are in reference to the data_keywords key. Many indicators have nothing there, and instead need to have at least empty quotes.
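To make the "display the field name" idea concrete, here is a small hand-rolled sketch of a type check that reports which field failed. It is not the JSONSchema validation that sdg-build runs; the function names and the `allowed` mapping are assumptions for illustration only.

```python
def json_type_names(value):
    """Return the JSON Schema type names that a Python value satisfies."""
    if value is None:
        return {"null"}
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"boolean"}
    if isinstance(value, int):
        return {"integer", "number"}
    if isinstance(value, float):
        return {"number"}
    if isinstance(value, str):
        return {"string"}
    # Anything else (for example a datetime) matches no JSON type listed above.
    return {type(value).__name__}


def find_type_errors(metadata, allowed_types):
    """Yield (field, message) for each field whose value has a disallowed type."""
    for field_name, allowed in allowed_types.items():
        if field_name not in metadata:
            continue
        value = metadata[field_name]
        if not json_type_names(value) & set(allowed):
            expected = ", ".join(repr(t) for t in allowed)
            yield field_name, f"{value!r} is not of type {expected}"


# Hypothetical type rules for two of the fields mentioned in this thread.
allowed = {"data_keywords": ["string", "integer"], "data_show_map": ["boolean"]}
metadata = {"data_keywords": None, "data_show_map": "Y"}

errors = list(find_type_errors(metadata, allowed))
for field_name, message in errors:
    print(f"{field_name}: {message}")
```

With the field name in front of each message, an empty data_keywords key would be identifiable without guessing.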
Next I saw this:
Validation errors for indicator 16-1-1
True is not of type 'string', 'integer'
datetime.datetime(2019, 3, 15, 0, 0) is not of type 'string', 'integer'
This pointed out some problems in 16-1-1's indicator_name and graph_title fields.
Last I saw this:
Validation errors for indicator 9-c-1
'Y' is not of type 'boolean'
This pointed out some problems in 9-c-1's data_show_map field.
After fixing these issues, validation passes again. I didn't go further than that though, so I'm not sure if that all helps with the SDMX problem.
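For anyone hitting the same errors, two of the fixes (empty keys and 'Y' flags) could in principle be applied in bulk before validation. The sketch below is one possible way to do that, not what the PR did; `clean_metadata` and the field lists are hypothetical. Values like True or a date in a title field are deliberately left alone, since those need fixing by hand in the source file.

```python
TRUE_STRINGS = {"y", "yes", "true"}
FALSE_STRINGS = {"n", "no", "false"}


def clean_metadata(metadata, string_fields, boolean_fields):
    """Return a copy with empty string fields and Y/N flags normalised."""
    cleaned = dict(metadata)
    # An empty key is loaded as None; replace it with an empty string.
    for name in string_fields:
        if name in cleaned and cleaned[name] is None:
            cleaned[name] = ""
    # Flags written as 'Y'/'N' text become real booleans.
    for name in boolean_fields:
        value = cleaned.get(name)
        if isinstance(value, str):
            lowered = value.strip().lower()
            if lowered in TRUE_STRINGS:
                cleaned[name] = True
            elif lowered in FALSE_STRINGS:
                cleaned[name] = False
    return cleaned


example = clean_metadata(
    {"data_keywords": None, "data_show_map": "Y"},
    string_fields=["data_keywords"],
    boolean_fields=["data_show_map"],
)
print(example)
```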
Note about validation: The new object-oriented approach to validation (what you're using here) is totally different from the old sdg-build validation. The new approach uses "JSONSchema" validation. So it's not surprising that the UK metadata is suddenly not passing validation: it has never been run through this JSONSchema validation before now.
Forgot to mention, I put up a PR with my fixes to those metadata issues here.
Ok great thanks, I've merged that PR and I'm no longer getting the validation errors. Still not sure what's causing this error though:
What does this function do?
That open-sdg code needs some commenting for sure - but my guess is that it converts the JSON produced by sdg-build into a format more directly usable by open-sdg.
Should it work in exactly the same way for CSV and SDMX files?
To be clear, that code is from open-sdg, and open-sdg is only seeing the output of sdg-build. The output of sdg-build is always the same, regardless of whether the input is CSV or SDMX. So yes, as long as sdg-build is doing its job, that open-sdg code should not notice any difference between CSV and SDMX data sources.
I think the next thing to look at is why sdg-build is not generating data: https://sdg.mango-solutions.com/data/sdmx/comb/1-1-1.json
This may be where the actual bug lies. I'll try to look at it when able, but I think that's the next hurdle.
Could it be something to do with the metadata files? The data was showing until I added the metadata files to the input.
@brockfanning are there any updates on this?
@LucyGwilliamAdmin Actually I did fix a bug about a week ago. Can you give it a try with version 0.4.1 of sdg-build?
Yeah, will do that now
@brockfanning great, this worked!
@brockfanning do you think there is a better format to have this, which will make it easier for countries to maintain?
@LucyGwilliamAdmin It would depend on the team involved. I've found that probably the simplest format would be CSV, since it can be edited with Excel. JSON can be confusing because of the extra syntax (braces, quotes, etc.). YAML might be better than JSON, but would need to be edited in a text editor (i.e., it could not be edited in Excel).
Yes, I was thinking CSV - originally the mapping was in CSV format but when I tried to convert this to JSON in the script there were some issues with commas (see here)
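If the comma problems came from values that themselves contain commas, one option is to quote those values in the CSV and let Python's csv module do the parsing rather than splitting lines manually. This is only a sketch under that assumption: the column names `code` and `label` and the rows are invented, since the real mapping layout isn't shown in this thread.

```python
import csv
import io
import json

# Hypothetical mapping content; a value containing a comma is wrapped in quotes.
csv_text = 'code,label\nA,"Males, aged 16 to 24"\nB,Females\n'


def csv_mapping_to_json(text):
    """Parse CSV text with the csv module and return it as a JSON string."""
    reader = csv.DictReader(io.StringIO(text))
    return json.dumps(list(reader), indent=2)


mapping_json = csv_mapping_to_json(csv_text)
print(mapping_json)
```

Excel writes the quotes automatically when saving a cell that contains a comma, so a mapping maintained in Excel should round-trip through this kind of parsing.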
@LucyGwilliamAdmin I'll go ahead and close this, but please re-open or start a new issue if there was anything left to cover.
Starting this issue to continue the conversation from #37