NASA-IMPACT / pyQuARC

The pyQuARC tool reads and evaluates metadata records with a focus on the consistency and robustness of the metadata. pyQuARC flags opportunities to improve or add to contextual metadata information in order to help the user connect to relevant data products. pyQuARC also ensures that information common to both the data product and the file-level metadata is consistent and compatible. pyQuARC frees up human evaluators to make more sophisticated assessments, such as whether an abstract accurately describes the data and provides the correct contextual information. The base pyQuARC package assesses descriptive metadata used to catalog Earth observation data products and files. As open source software, pyQuARC can be adapted and customized by data providers to allow for quality checks that evolve with their needs, including checking metadata not covered in the base package.
Apache License 2.0
19 stars 0 forks

ECHO10 URL Type Modifications #274

Open jenny-m-wood opened 4 months ago

jenny-m-wood commented 4 months ago

Describe the bug

ECHO10 has only a single URL Type field (rather than separate Content Type, Type, and Subtype fields). Providers sometimes supply a combination of the Content Type, Type, and Subtype in that single URL Type field, but pyQuARC does not currently support this.

To Reproduce

Steps to reproduce the behavior:

  1. Open the test echo10 collection metadata file in pyQuARC
  2. Enter a value of "VIEW RELATED INFORMATION : GENERAL DOCUMENTATION" or "PublicationURL : VIEW RELATED INFORMATION" in the Online Resources/OnlineResource/Type field
  3. See GCMD inconsistency errors

Expected behavior

No error is given for ECHO10 records whose URL Type is a combination of Content Type and Type, or of Type and Subtype, separated by ":", " : ", ": ", or " :", as long as the combination follows the correct GCMD hierarchy.
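The expected behavior could be sketched roughly as follows. This is not pyQuARC's actual implementation: the names `SAMPLE_HIERARCHY`, `split_url_type`, and `is_valid_echo10_url_type` are hypothetical, and the hierarchy is a one-entry excerpt built from the values mentioned in this issue rather than the full GCMD keyword list. The sketch also assumes exact, case-sensitive matching and rejects anything with more than two parts, neither of which is stated in the issue.

```python
import re

# Illustrative excerpt only (Content Type -> Type -> Subtypes), taken from the
# values quoted in this issue. The real GCMD keyword list is much larger.
SAMPLE_HIERARCHY = {
    "PublicationURL": {
        "VIEW RELATED INFORMATION": {"GENERAL DOCUMENTATION"},
    },
}

# A colon with optional whitespace on either side covers ":", " : ", ": ", " :".
SEPARATOR = re.compile(r"\s*:\s*")


def split_url_type(value):
    """Split an ECHO10 URL Type value into its parts, trimming whitespace."""
    return SEPARATOR.split(value.strip())


def is_valid_echo10_url_type(value, hierarchy=SAMPLE_HIERARCHY):
    """Return True if the value is a Type, a Subtype, or a valid two-level pair."""
    parts = split_url_type(value)

    if len(parts) == 1:
        # A lone Type or a lone Subtype is acceptable.
        term = parts[0]
        return any(
            term == url_type or term in subtypes
            for types in hierarchy.values()
            for url_type, subtypes in types.items()
        )

    if len(parts) == 2:
        first, second = parts
        # "Content Type : Type"
        if second in hierarchy.get(first, {}):
            return True
        # "Type : Subtype"
        return any(second in types.get(first, set()) for types in hierarchy.values())

    # Assumption: three or more parts are not accepted.
    return False
```

With this excerpt, "PublicationURL:VIEW RELATED INFORMATION" and "VIEW RELATED INFORMATION : GENERAL DOCUMENTATION" both pass, while a pair that skips a level or reverses the order, such as "PublicationURL : GENERAL DOCUMENTATION", does not.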

Outputs

Acceptable values (i.e. no GCMD error outputs) for echo10 would include:

"Type"
"Subtype"
"Type : Subtype"
"Type: Subtype"
"Type :Subtype"
"Type:Subtype"
"Content Type : Type"
"Content Type :Type"
"Content Type: Type"
"Content Type:Type"

Unacceptable values (i.e. GCMD error outputs) for echo10 would include, but are not limited to: