NCEAS / metadig-engine

MetaDig Engine: multi-dialect metadata assessment engine

support AI-readiness data checks in data quality engine #366

Open mbjones opened 1 year ago

mbjones commented 1 year ago

For the FAIR Data Quality check engine (issue #328), consult the ESIP AI-Readiness Checklist for a list of data quality checks that the community considers important when assessing whether data are ready for ML tooling. See:

I propose that these would be a good candidate set of checks: they have already been vetted by ESIP and would be a useful way to vet the data quality engine. Maybe it would be its own suite?

mbjones commented 1 year ago

Also see the Analysis Ready Data (ARD) standards from CEOS:

There is a new OGC working group focused on ARD data standards: https://www.ogc.org/press-release/ogc-forms-new-analysis-ready-data-standards-working-group/

jeanetteclark commented 4 days ago

I reformatted the AI-Readiness Checklist into a CSV with a column for whether each "check" could actually be implemented in an automated way. My values in that column are a best guess, a first-instinct kind of answer. Based on the list, I identified the following checks that are already implemented:

The following checks could be easily implemented:

The rest are either not applicable or would be difficult or impossible to implement.

ML-checks.csv
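
For anyone who wants to triage the attached file programmatically, here is a minimal sketch of grouping checks by their automation status. The column names (`check`, `automatable`), the status values, and the sample rows are all assumptions made up for illustration; the actual headers and values in ML-checks.csv may differ and would need to be substituted.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-in for ML-checks.csv: the column names, status values,
# and check names below are placeholders, not the contents of the real file.
SAMPLE_CSV = """check,automatable
placeholder check A,implemented
placeholder check B,implemented
placeholder check C,easy
placeholder check D,difficult
"""


def group_checks(csv_text, status_column="automatable", name_column="check"):
    """Return a dict mapping each status value to the list of check names."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Normalize the status so "Easy" and "easy " land in the same group.
        status = (row.get(status_column) or "").strip().lower()
        groups[status].append((row.get(name_column) or "").strip())
    return dict(groups)


if __name__ == "__main__":
    for status, names in group_checks(SAMPLE_CSV).items():
        print(f"{status}: {len(names)} check(s)")
```

To run this against the real attachment, read the file contents with `open(path, newline="").read()` and pass the actual column names as `status_column` and `name_column`.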