NCEAS / z-test-issues

Test issue imports from redmine

Use datamanager for EML QA/QC #477

Open mbjones opened 7 years ago

mbjones commented 7 years ago

Author Name: ben leinfelder (ben leinfelder)
Original Redmine Issue: 4393, https://projects.ecoinformatics.org/ecoinfo/issues/4393
Original Date: 2009-09-17
Original Assignee: ben leinfelder


As discussed at the LTER meeting this year:

Work Group: Metrics and reports for EML data package quality

The EML data manager library (contributors: Costa, Tao, Leinfelder, Servilla) was created to parse EML metadata documents and insert the described data entities into a relational database. Our experience using the library with data packages contributed to the LTER NIS indicates that a large fraction do not have metadata of sufficient quality for the data to be used in this way. The primary contribution from LTER sites to the NIS is data sets, which are intended to be used in cross-site synthesis projects. Clearly, for cross-site synthesis to make use of the NIS, a certain minimum level of metadata and data quality is required. The goals for this group are to:

  1. establish a set of metrics for LTER EML data package quality,
  2. recommend content for a report to be produced by the EML data manager library, and
  3. consider implementation strategies, e.g., should the report be another choice on the EML parser page, or a shell script similar to the one included with the EML parser?

The quality reports can be used to:

  1. inform the dataset contributor about the content of the data package, and indicate whether the data are of sufficient quality to be machine-readable. Our data catalog (Metacat) has no quality standards beyond basic XML and EML compliance, so a data package that fails these quality metrics can still be uploaded or harvested, although its usefulness is limited.
  2. produce, in the LTER context, a list of failure modes for LTER metadata and data entities. Such a list could provide input for the design of specific tools for data providers, or help identify gaps in a site's IM system. A site requesting supplemental funding for its IMS could use the reports as part of the proposal justification.
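
To make the idea of a per-package report concrete, here is a minimal sketch in Python. It is not the data manager library (which is a separate code base) and the metrics are not ones the group has agreed on. The function name `quality_report` and the particular checks are assumptions for illustration only. The element paths are standard EML `dataTable` children. For each data table the sketch lists which pieces of metadata needed to load the data are missing.

```python
import xml.etree.ElementTree as ET

# Hypothetical checks, chosen only for illustration: each pair is
# (path relative to a dataTable element, message reported when it is absent).
CHECKS = [
    ("attributeList/attribute", "no attributes described"),
    ("physical/dataFormat/textFormat/simpleDelimited/fieldDelimiter",
     "no field delimiter declared"),
    ("physical/distribution/online/url", "no data URL given"),
]


def quality_report(eml_text):
    """Return {entity name: [failure messages]} for each dataTable in an EML document."""
    root = ET.fromstring(eml_text)
    report = {}
    for index, table in enumerate(root.iter("dataTable"), start=1):
        # Fall back to a positional label when entityName is missing or empty.
        name = table.findtext("entityName") or f"dataTable #{index}"
        report[name] = [msg for path, msg in CHECKS if table.find(path) is None]
    return report
```

An empty list for an entity means it passed every check in this sketch. A non-empty list is the start of a "failure mode" inventory of the kind described in item 2.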

As a starting point for our discussion, I have started a flowchart based on my own experience with the data manager library and SBC's EML data packages.

Here is the current membership (on this cc list, and present in Estes Park): Margaret O'Brien, SBC; Emery Boose, HFR; Dan Bahauddin, CDR; James Brunt, LNO; Mark Servilla, LNO; Duane Costa, LNO; Mark Schildhauer, NCEAS; Ben Leinfelder, NCEAS.

mbjones commented 7 years ago

Original Redmine Comment
Author Name: Redmine Admin (Redmine Admin)
Original Date: 2013-03-27T21:26:42Z


Original Bugzilla ID was 4393