NASA-PDS / validate

Validates PDS4 product labels, data and PDS3 Volumes
https://nasa-pds.github.io/validate/
Apache License 2.0

As a user, I want validate to throw an error when a collection inventory contains an invalid secondary product reference #462

Closed smclaughlin7 closed 1 year ago

smclaughlin7 commented 2 years ago

Validate should throw an error if it detects an invalid LID specified for a secondary context in a collection inventory.

💪 Motivation

An incorrect LID for a secondary context product would lead to a referential integrity issue within the parent collection on the PDS side.

📖 Additional Details

The NSSDCA's ingest process for PDS4 deliveries detected several secondary context products with incorrect LIDs specified in the collection inventories of several bundles/SIPs. This is not a problem for the NSSDCA because we do not archive secondary products. (Pds-deep-archive does not include them in the SIP manifest.) However, we (NSSDCA) do track the secondary products (LIDs) listed in collection inventories because the deep archive should eventually receive those products as primaries from PDS.

The attached spreadsheet, PDS4MissingMembershipReport20211201_SecondaryIssuesForATM.xlsx, provides examples of incorrect context product LIDs that NSSDCA ingest recently detected for several ATM bundles. The incorrect LIDs are highlighted in red in column B. Column F specifies the correct LIDs based on EN's master context repository at https://pds.nasa.gov/data/pds4/context-pds4/. The other secondary context LIDs (black font) captured in column B are simply NSSDCA remarks and are not relevant to this issue.

βš–οΈ Acceptance Criteria

tbd

βš™οΈ Engineering Details

al-niessner commented 1 year ago

@jordanpadams @nutjob4life @tloubrieu-jpl

Moving onto next. Looking at the spreadsheet, all of these would be caught by #415. Did you want to catch it sooner? If so, it would require either:

  1. validate needs access to registry
  2. all references must be added with containers, even if that means adding an old version that already exists in the registry

jordanpadams commented 1 year ago

Copy, thanks @al-niessner. Added this to #600 so we can track this to closure with that PR.

miguelp1986 commented 1 year ago

@jordanpadams could you offer some guidance on testing this? I ran the first item in the spreadsheet and didn't get any errors:

validate -t . --rule pds4.collection

PDS Validate Tool Report

Configuration:
   Version                       3.3.0-SNAPSHOT
   Date                          2023-08-14T22:15:15Z

Parameters:
   Targets                       [file:/Users/MPena/Documents/PDS/validate_test_files/462/]
   Rule Type                     pds4.collection
   Severity Level                WARNING
   Recurse Directories           true
   File Filters Used             [*.xml, *.XML]
   Data Content Validation       on
   Product Level Validation      on
   Allow Unlabeled Files         false
   Max Errors                    100000
   Registered Contexts File      /usr/local/validate-3.3.0-SNAPSHOT/resources/registered_context_products.json

Product Level Validation Results

  PASS: file:/Users/MPena/Documents/PDS/validate_test_files/462/collection_context_saturn_thermosphere_h2_density_temp.xml
        1 product validation(s) completed

PDS4 Collection Level Validation Results

  PASS: file:/Users/MPena/Documents/PDS/validate_test_files/462/collection_context_saturn_thermosphere_h2_density_temp.xml
        1 integrity check(s) completed

Summary:

  0 error(s)
  0 warning(s)

  Product Validation Summary:
    1          product(s) passed
    0          product(s) failed
    0          product(s) skipped

  Referential Integrity Check Summary:
    1          check(s) passed
    0          check(s) failed
    0          check(s) skipped

End of Report
Completed execution in 2870 ms

jordanpadams commented 1 year ago

@miguelp1986 I think you can just add a fake LIDVID to one of those collection inventories and rerun, e.g.

collection_test.xml:

P,valid_lidvid1::1.0
S,valid_lidvid2::1.0
S,foobar_fail1::1.0
S,foobar_fail2
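
For illustration only, the kind of check this issue asks for could be sketched in Python as below. This is not how validate implements it. The function name `find_invalid_secondary_refs` and the `known` set are hypothetical; the set stands in for whatever lookup (registered context products, registry) the real tool would use. The sketch assumes each inventory row is `status,reference` and that a reference without `::` is a bare LID.

```python
import csv
import io


def find_invalid_secondary_refs(inventory_text, known_lidvids):
    """Return secondary ('S') inventory references not found in known_lidvids."""
    # Bare LIDs (no "::version") are compared against the LID part only.
    known_lids = {lidvid.split("::")[0] for lidvid in known_lidvids}
    invalid = []
    for row in csv.reader(io.StringIO(inventory_text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        status, ref = row[0].strip(), row[1].strip()
        if status != "S":
            continue  # only secondary members are checked here
        found = ref in known_lidvids if "::" in ref else ref in known_lids
        if not found:
            invalid.append(ref)
    return invalid


# The example inventory from the comment above.
inventory = """P,valid_lidvid1::1.0
S,valid_lidvid2::1.0
S,foobar_fail1::1.0
S,foobar_fail2
"""

# Hypothetical set of references that are known to exist.
known = {"valid_lidvid1::1.0", "valid_lidvid2::1.0"}

print(find_invalid_secondary_refs(inventory, known))
```

With the inventory above, the two `foobar_fail` rows are the ones reported; a real check would raise a validation error for each instead of returning a list.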