healthysustainablecities / global-indicators

An open-source tool for calculating and reporting spatial indicators for healthy, sustainable cities worldwide using open or custom data.
MIT License

Should output explicit metadata and data dictionary when generating resources #230

Closed carlhiggs closed 1 year ago

carlhiggs commented 1 year ago

Currently, the 'generate resources' step focuses on generating a report, but more generally it represents the output of a range of resources: data files (currently exported at the end of neighbourhood analysis: a geopackage of indicators and related data, and CSV files with grid and region summaries), maps and figures (used in the report), the report itself, and later also validation reports and scorecard infographics.

However, no explicit metadata or data dictionary is generated to accompany these. We save text file copies of the parameters used in analysis (and retain dated copies if these change through the course of analysis, to help track provenance) and the running log of messages generated through analysis, but not an XML metadata file (or equivalent, although the parameters contain the information required to create one) or a CSV/xlsx data dictionary.

These are important resources to support dissemination and usability of the generated resources, and should be relatively straightforward to implement for the main outputs.
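To illustrate the data dictionary part, here is a minimal sketch using only the Python standard library. The variable names, units, and descriptions are hypothetical placeholders rather than the project's actual indicator definitions, and the helper functions are made up for this example. It writes a CSV data dictionary and checks that every column of an output file has an entry.

```python
import csv
import io

# Hypothetical entries; the real variable names and descriptions would come
# from the project's own indicator definitions.
VARIABLES = [
    {"variable": "grid_id", "type": "str", "unit": "", "description": "Grid cell identifier"},
    {"variable": "pop_est", "type": "float", "unit": "persons", "description": "Estimated population"},
    {"variable": "walkability", "type": "float", "unit": "z-score", "description": "Walkability index"},
]


def write_data_dictionary(variables, handle):
    """Write one CSV row per documented variable to an open text handle."""
    fields = ["variable", "type", "unit", "description"]
    writer = csv.DictWriter(handle, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(variables)


def missing_entries(columns, variables):
    """Return output columns that have no data dictionary entry."""
    documented = {v["variable"] for v in variables}
    return [c for c in columns if c not in documented]


# Write to an in-memory buffer here; a real run would open a .csv file instead.
buffer = io.StringIO()
write_data_dictionary(VARIABLES, buffer)
data_dictionary_csv = buffer.getvalue()

# Example check against the columns of a (hypothetical) exported summary file.
undocumented = missing_entries(["grid_id", "pop_est", "new_indicator"], VARIABLES)
```

Running the check at export time would flag any indicator added to the outputs without a matching dictionary entry.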

carlhiggs commented 1 year ago

This seems like a good option to test out for generating ISO 19115 compatible metadata using our region-specific parsed configuration parameters: https://github.com/antarctica/metadata-library

It is produced by a verified account for the British Antarctic Survey and contains a good example that could be adapted fairly easily for our use case, assuming the package works in the context of our Docker image.

carlhiggs commented 1 year ago

Re the above (bas-metadata-library), I tried a few implementations but couldn't get Conda within the Docker image to successfully install this package using pip as per the directions. Also, I noticed that the latest ISO 19115 standard supported was 2009 but the current one is 2019. If I could have got things installed I wouldn't have worried about that, but given it's a struggle to get things working, I'll look for another solution.

carlhiggs commented 1 year ago

pygeometa (https://github.com/geopython/pygeometa) may be a better bet. It appears to be under active development, installs correctly, and has some guidance, including on nesting data records: https://geopython.github.io/pygeometa and https://geopython.github.io/pygeometa/reference/mcf/#nesting-mcfs

carlhiggs commented 1 year ago

I just installed pygeometa and confirmed that it works to convert a YAML configuration file, formatted according to the MCF specification, into a requested metadata standard, so now it's just a question of implementing it.

I think the process could be:
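As a rough sketch of what the implementation might involve, the snippet below maps a hypothetical set of region parameters onto an MCF-style dictionary. The `example_region` keys and values are placeholders, not the project's actual configuration schema, and the dictionary covers only a few MCF sections; a complete MCF would need more (contact and distribution details, for example) before pygeometa could render it. The `render_iso19139` helper follows the usage shown in the pygeometa README, but it is not called here and should be checked against the installed version.

```python
def build_mcf(region):
    """Map (hypothetical) region parameters onto an MCF-style dictionary."""
    return {
        "mcf": {"version": 1.0},
        "metadata": {
            "identifier": region["codename"],
            "language": "en",
            "charset": "utf8",
            "datestamp": region["date"],
        },
        "identification": {
            "language": "en",
            "title": f"Spatial indicators for {region['name']}, {region['year']}",
            "abstract": region["description"],
            # Bounding box as [minx, miny, maxx, maxy] in EPSG:4326.
            "extents": {"spatial": [{"bbox": region["bbox"], "crs": 4326}]},
        },
    }


def render_iso19139(mcf):
    """Render an MCF dictionary as ISO 19139 XML (requires pygeometa)."""
    # Imported lazily so the rest of the sketch runs without pygeometa installed.
    from pygeometa.schemas.iso19139 import ISO19139OutputSchema

    return ISO19139OutputSchema().write(mcf)


# Placeholder values for illustration only.
example_region = {
    "codename": "example_city_2023",
    "name": "Example City",
    "year": 2023,
    "date": "2023-01-01",
    "description": "Placeholder abstract for the generated indicator data.",
    "bbox": [-1.0, -1.0, 1.0, 1.0],
}

mcf = build_mcf(example_region)
```

Keeping the mapping in a single function would let the same parsed parameters feed both the metadata file and any other generated resources.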