NASA-PDS / registry

The PDS Registry provides the services and software applications necessary for tracking, searching, auditing, locating, and maintaining artifacts within the system. These artifacts include data files, label files, schemas, dictionary definitions for objects and elements, and services.
https://nasa-pds.github.io/registry
Apache License 2.0

As a registry user, I want to ingest supplemental metadata from external data sources #49

Open jordanpadams opened 3 years ago

jordanpadams commented 3 years ago

Motivation

...so that I can ingest supplemental metadata into the registry that is both archival (Product_Metadata_Supplemental) and non-archival (e.g. external databases) to make the information searchable and accessible.

Additional Details

Note: This ticket is the parent epic for the overarching design consideration; there are several sub-tickets for the individual data source implementations.

Some use cases we need to think about:

  1. Use the PDS4 product Product_Metadata_Supplemental to revise data in products in the archive. NOTE: Need to check with RMS whether or not these should be considered archival.
  2. Non-archival metadata: "augmented" or "supplemental" metadata generated after archiving that can help with search and/or improve the existing archive metadata, e.g. image tags indicating that a crater exists in the image, more detailed target information like Mars - Olympus Mons, or tracking service information like "here is this product's DOI that was never registered".
     a. Should we just treat these the same as Product_Metadata_Supplemental? (Depends on whether those are considered archival.)
  3. PDS3 databases: how do we integrate with databases that contain PDS3 data?
     a. Should we just treat these the same as Product_Metadata_Supplemental? (Depends on whether those are considered archival.)

Acceptance Criteria

See sub-tickets for individual data source acceptance criteria

Engineering Details

Design thoughts

Should we just use Product_Metadata_Supplemental as the model for this and then augment it for the non-archival data sources? It seems like we should require some sort of archive-worthy definition of the data to be ingested. This could almost be a one-to-one translation of Product_Metadata_Supplemental, just made non-archival.
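
To make the "same model, archival or not" idea concrete, here is a rough sketch of what a shared record shape could look like. Everything in it is an assumption for discussion only: `SupplementalRecord`, `apply_supplement`, the `archival` flag, and the `supplemental:` / `supplemental_nonarchival:` field prefixes are hypothetical names, not part of the PDS4 Information Model or the registry code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SupplementalRecord:
    """Hypothetical in-memory shape for one supplemental metadata entry."""

    lidvid: str  # identifier of the product being supplemented
    values: Dict[str, List[str]] = field(default_factory=dict)
    source: str = "Product_Metadata_Supplemental"  # assumed label for the origin
    archival: bool = True  # False for external, non-archival sources


def apply_supplement(product: dict, record: SupplementalRecord) -> dict:
    """Return a copy of `product` with the supplemental values merged in.

    Non-archival values get their own (hypothetical) prefix so they stay
    distinguishable from archive metadata when searching.
    """
    merged = dict(product)
    prefix = "supplemental" if record.archival else "supplemental_nonarchival"
    for name, vals in record.values.items():
        merged[f"{prefix}:{name}"] = list(vals)
    return merged
```

The only difference between the archival and non-archival cases in this sketch is the flag and the resulting field prefix, which is roughly what a one-to-one translation of Product_Metadata_Supplemental would imply.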

For the external databases, as an initial workaround, we may just use a CSV for the time being (NASA-PDS/registry#51) and implement a MySQL plugin down the road, since that would require some additional data modeling and configuration to connect to the database.
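
As a minimal sketch of the CSV workaround, assuming a file with a `lidvid` column followed by one column per supplemental field (this layout and the function name `load_supplemental_csv` are assumptions, not the format defined in NASA-PDS/registry#51), the rows could be grouped per product like this:

```python
import csv
import io
from collections import defaultdict


def load_supplemental_csv(handle):
    """Group CSV rows by product identifier.

    Assumes a header row with a 'lidvid' column; every other column is
    treated as a supplemental field. Repeated rows for the same product
    accumulate values, and empty cells are skipped.
    """
    by_product = defaultdict(dict)
    for row in csv.DictReader(handle):
        lidvid = (row.pop("lidvid", "") or "").strip()
        if not lidvid:
            continue  # a row without an identifier cannot be attached to a product
        for name, value in row.items():
            if name is None or not value:
                continue  # ignore overflow columns and empty cells
            by_product[lidvid].setdefault(name, []).append(value.strip())
    return dict(by_product)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = io.StringIO(
        "lidvid,tag\n"
        "urn:nasa:pds:example:data:img1::1.0,crater\n"
        "urn:nasa:pds:example:data:img1::1.0,dune\n"
    )
    print(load_supplemental_csv(sample))
```

A later MySQL plugin could produce the same per-product dictionary, so the code that writes to the registry would not need to know which source the values came from.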

jordanpadams commented 3 years ago

Completed NASA-PDS/registry#121. Moving the handling of the rest of these use cases to the next build.