Open jordanpadams opened 5 years ago
I haven't fully digested the email discussion, but just to summarise - the intention is that any target which appears in an observational product should be replicated in the parent collection and/or bundle reference list? For other types of context products this makes sense (investigation, host etc.) but for targets it could get... messy.
Currently I have been assuming that in the PSA we would curate the bundle label reference lists for the primary mission target(s), and not for every single target that we may have observed during a long cruise, or calibration targets etc.
I'm not against it, per se, but because we dynamically update our bundle and collection labels with every product ingestion, this would take some database work to implement.
@msbentley the primary search scenario here is someone trying to "browse" the archive for any bundle / collection that may contain products that looked at target "X" (e.g. a browsing feature of some kind). Some observations may be serendipitous, made during a long cruise, etc., but we can't say for sure that they could not influence the science. I imagine this will be a WARNING message because of the scenario you are mentioning. But we wanted to at least make people aware in the event they intended to include all targets in the bundle/collection.
thoughts?
Yeah, I can kinda see that, but I guess I'm assuming that most people would hit a search engine/the registry and look for products with a given target. But I guess if they're coming through Google Dataset Search or something that's bringing them in at bundle or collection level, then it would be useful. If it's warning only, fine with me!
Are there test resources for this issue?
@jordanpadams Are there test resources for this ticket, or should I try to come up with some? I am not even sure where to start. It would help to have concrete examples to work from.
further engineering details:
Going back to the diagram here: https://github.com/NASA-PDS/registry-api/issues/458:
Ignoring the arrows on the right side of this diagram for a moment: for referential integrity checking purposes, validate already answers the questions "What collections belong to this bundle?" and "What products belong to this collection?". The classes that do this already exist and have been performing the referential integrity checking successfully.
What we are saying for this ticket is, starting from the bottom of the tree, all context objects referred to in Products I, J, and K should be in Collection X, all context objects in Product L should be in Collection Y, etc.
Going up the tree, all context objects referred to in Collections X, Y, and Z should be referenced in Bundle A.
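The two rules above can be sketched as a set difference applied at each level of the tree. This is only an illustration in Python, not validate's actual implementation: the function name `missing_context_refs`, the short child identifiers, and the three `example.*` target LIDs are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

def missing_context_refs(
    parent_refs: Set[str],
    children: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Return (missing_lid, child_id) pairs for every context LID that a
    child references but its parent does not."""
    missing = []
    for child_id in sorted(children):
        # Anything the child references that the parent lacks is a gap.
        for lid in sorted(children[child_id] - parent_refs):
            missing.append((lid, child_id))
    return missing

# Hypothetical context LIDs, made up for this sketch.
A = "urn:nasa:pds:context:target:example.a"
B = "urn:nasa:pds:context:target:example.b"
C = "urn:nasa:pds:context:target:example.c"

# Products I, J, K belong to Collection X; Collection X belongs to Bundle A.
products_in_x = {"I": {A}, "J": {A, B}, "K": {A}}
collection_x_refs = {A, C}
bundle_a_refs = {A}

# Bottom of the tree: products checked against their collection.
collection_gaps = missing_context_refs(collection_x_refs, products_in_x)
# One level up: collections checked against the bundle.
bundle_gaps = missing_context_refs(bundle_a_refs, {"X": collection_x_refs})

print(collection_gaps)
print(bundle_gaps)
```

Here Product J references a target that Collection X omits, and Collection X references a target that Bundle A omits, so each level reports one gap.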
This sounds good. I hadn't thought to check this before.
We expect that the missing references discussed here will always raise a WARNING.
Is this a fail? product_observational references a target not referenced in bundle or collection. validate flags the bundle but not the collection. Much more subtle: the collection references a target that none of its products reference. val69.zip
It's probably a fail. The revised test has these lid_references to targets:
bundle: saturn narvi
collection: saturn titan
data: saturn narvi
% validate -R pds4.bundle -t val69b
PDS Validate Tool Report
Configuration:
  Version 2.2.0-SNAPSHOT
  Date 2021-10-25T23:45:19Z
Parameters:
  Targets [file:/Users/rchen/Desktop/val69b/]
  Rule Type pds4.bundle
  Severity Level WARNING
  Recurse Directories true
  File Filters Used [.xml, .XML]
  Data Content Validation on
  Product Level Validation on
  Allow Unlabeled Files false
  Max Errors 100000
  Registered Contexts File /Users/rchen/PDS4tools/validate/resources/registered_context_products.json
Product Level Validation Results
  PASS: file:/Users/rchen/Desktop/val69b/bundle-vg1-sat-pos-l1coords-1.0.xml
    1 product validation(s) completed
  PASS: file:/Users/rchen/Desktop/val69b/data-sedr/SEDR_L1.xml
    2 product validation(s) completed
  PASS: file:/Users/rchen/Desktop/val69b/data-sedr/collection-data-sedr-1.0.xml
    3 product validation(s) completed
PDS4 Bundle Level Validation Results
  PASS: file:/Users/rchen/Desktop/val69b/data-sedr/collection-data-sedr-1.0.xml
    1 integrity check(s) completed
  PASS: file:/Users/rchen/Desktop/val69b/bundle-vg1-sat-pos-l1coords-1.0.xml
    WARNING [warning.integrity.missing_context_reference] This file should reference 'urn:nasa:pds:context:target:satellite.saturn.narvi' because its child product with LIDVID urn:nasa:pds:vg1-saturn-pos-l1coords:data-sedr:sedr-l1::1.0 references it.
    WARNING [warning.integrity.missing_context_reference] This file should reference 'urn:nasa:pds:context:target:satellite.saturn.titan' because its child product with LIDVID urn:nasa:pds:vg1-saturn-pos-l1coords:data-sedr::1.0 references it.
    2 integrity check(s) completed
  PASS: file:/Users/rchen/Desktop/val69b/data-sedr/SEDR_L1.xml
    3 integrity check(s) completed
Summary:
  0 error(s)
  2 warning(s)
  Product Validation Summary:
    3 product(s) passed
    0 product(s) failed
    0 product(s) skipped
  Referential Integrity Check Summary:
    3 check(s) passed
    0 check(s) failed
    0 check(s) skipped
  Message Types:
    2 warning.integrity.missing_context_reference
End of Report
Completed execution in 4662 ms
@qchaupds @jordanpadams Another probable point of failure: context products all have LIDs of the form urn:::context:..., i.e. the check should look for "context" in the LID and ignore references to anything else. The attached should generate no warnings or errors.
% validate -R pds4.bundle -t val308a
PDS Validate Tool Report
Configuration:
  Version 2.2.0-SNAPSHOT
  Date 2021-10-26T02:33:46Z
Parameters:
  Targets [file:/Users/rchen/Desktop/test/val308a/]
  Rule Type pds4.bundle
  Severity Level WARNING
  Recurse Directories true
  File Filters Used [.xml, .XML]
  Data Content Validation on
  Product Level Validation on
  Allow Unlabeled Files false
  Max Errors 100000
  Registered Contexts File /Users/rchen/PDS4tools/validate/resources/registered_context_products.json
Product Level Validation Results
  PASS: file:/Users/rchen/Desktop/test/val308a/bundle-voyager1-pls-sat-1.0.xml
    1 product validation(s) completed
  PASS: file:/Users/rchen/Desktop/test/val308a/browse-ion-moments/collection-browse-ion-moments-1.0.xml
    2 product validation(s) completed
  PASS: file:/Users/rchen/Desktop/test/val308a/browse-ion-moments/ION_MOM.xml
    3 product validation(s) completed
  PASS: file:/Users/rchen/Desktop/test/val308a/data-ion-moments-96sec/collection-data-ion-moments-96s-1.0.xml
    4 product validation(s) completed
  PASS: file:/Users/rchen/Desktop/test/val308a/data-ion-moments-96sec/ION_MOM.xml
    5 product validation(s) completed
PDS4 Bundle Level Validation Results
  PASS: file:/Users/rchen/Desktop/test/val308a/browse-ion-moments/collection-browse-ion-moments-1.0.xml
    1 integrity check(s) completed
  PASS: file:/Users/rchen/Desktop/test/val308a/bundle-voyager1-pls-sat-1.0.xml
    WARNING [warning.integrity.missing_context_reference] This file should reference 'urn:nasa:pds:vg1-pls-sat:data-ion-moments-96sec:ion-mom' because its child product with LIDVID urn:nasa:pds:vg1-pls-sat:browse-ion-moments:ion-mom::1.0 references it.
    WARNING [warning.integrity.missing_context_reference] This file should reference 'urn:nasa:pds:vg1-pls-sat:browse-ion-moments:ion-mom' because its child product with LIDVID urn:nasa:pds:vg1-pls-sat:data-ion-moments-96sec:ion-mom::1.0 references it.
    2 integrity check(s) completed
  PASS: file:/Users/rchen/Desktop/test/val308a/data-ion-moments-96sec/collection-data-ion-moments-96s-1.0.xml
    3 integrity check(s) completed
  PASS: file:/Users/rchen/Desktop/test/val308a/data-ion-moments-96sec/ION_MOM.xml
    4 integrity check(s) completed
  PASS: file:/Users/rchen/Desktop/test/val308a/browse-ion-moments/ION_MOM.xml
    5 integrity check(s) completed
Summary:
  0 error(s)
  2 warning(s)
  Product Validation Summary:
    5 product(s) passed
    0 product(s) failed
    0 product(s) skipped
  Referential Integrity Check Summary:
    5 check(s) passed
    0 check(s) failed
    0 check(s) skipped
  Message Types:
    2 warning.integrity.missing_context_reference
End of Report
Completed execution in 6189 ms
val308a.zip
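The two warnings in this report are about references between ordinary products, not context products. One way to avoid that false positive is to filter references before the check. The sketch below is an illustration in Python, not validate's code. It assumes a LID has the colon-separated layout `urn:<agency>:<org>:<bundle id>:...` and that context products always sit in a bundle named `context`. The function name `is_context_lid` is hypothetical.

```python
def is_context_lid(lid: str) -> bool:
    """Return True if the LID (or LIDVID) appears to name a context product."""
    # Drop a trailing "::<version>" in case a LIDVID was passed in.
    lid_only = lid.split("::", 1)[0]
    fields = lid_only.split(":")
    # Assumption: the fourth field is the bundle id, and context products
    # use the bundle id "context".
    return len(fields) >= 4 and fields[0] == "urn" and fields[3] == "context"

refs = [
    "urn:nasa:pds:context:target:satellite.saturn.narvi",
    "urn:nasa:pds:vg1-pls-sat:browse-ion-moments:ion-mom::1.0",
    "urn:nasa:pds:vg1-pls-sat:data-ion-moments-96sec:ion-mom",
]
context_refs = [r for r in refs if is_context_lid(r)]
print(context_refs)
```

Comparing the bundle-id field rather than searching for the substring "context" anywhere in the LID avoids matching a non-context bundle or collection whose name merely contains that word.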
@rchenatjpl created a new ticket to track this at #430
Motivation
...so that we can enable quality search results when searching at the collection/bundle level
Additional Details
Reverted per https://github.com/NASA-PDS/validate/pull/456
All unique context objects specified in observational products must be referenced in the Reference_List of the parent collection and bundle. These context objects are referenced from:
Acceptance Criteria
Given a bundle with target identified as X
When I perform validation of the bundle (-R pds4.bundle) with products that have targets X and Y
Then I expect validate to throw a WARNING that not all targets are specified in the parent bundle

Given a collection J with target identified as X
When I perform validation of the bundle containing collection J (-R pds4.bundle) with products that have targets X and Y
Then I expect validate to throw a WARNING that not all targets are specified in the parent collection

Given a collection J with target identified as X
When I perform validation of the bundle containing collection J (-R pds4.bundle) with products that have targets X and Y, with the --skip-context-reference-check flag
Then I expect validate to NOT throw a WARNING

Engineering Details
Some background: Per an email chain with @mitchgordon, @lynnneakrase, @rsjoyner, and @rchenatjpl, the issue arose of how to specify numerous targets within a data collection. Several alternatives were considered, including specifying the targets in the context collection and specifying a planetary_system instead of the individual targets. However, it was determined that the best way to specify targets at the collection/bundle level is to explicitly add all targets to the bundle/collection label Reference_List. This solution should apply across all context objects.
missing_context_reference
include in message
Disable with --skip-context-reference-check flag.
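The acceptance criteria and the flag can be sketched together as follows. This is a hedged Python illustration, not validate's implementation: only the flag name `--skip-context-reference-check`, the message type, and the warning wording come from this ticket and the reports above. The parser, the program name `validate-sketch`, and the function `context_reference_warnings` are hypothetical.

```python
import argparse
from typing import List, Set

MESSAGE_TYPE = "warning.integrity.missing_context_reference"

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for validate's command line; only the flag
    # name is taken from this ticket.
    parser = argparse.ArgumentParser(prog="validate-sketch")
    parser.add_argument("--skip-context-reference-check", action="store_true")
    return parser

def context_reference_warnings(
    parent_refs: Set[str],
    child_lidvid: str,
    child_refs: Set[str],
    skip: bool,
) -> List[str]:
    """Build one WARNING line per context LID missing from the parent."""
    if skip:
        return []  # The check is disabled, so nothing is reported.
    return [
        f"WARNING [{MESSAGE_TYPE}] This file should reference '{lid}' "
        f"because its child product with LIDVID {child_lidvid} references it."
        for lid in sorted(child_refs - parent_refs)
    ]

titan = "urn:nasa:pds:context:target:satellite.saturn.titan"
narvi = "urn:nasa:pds:context:target:satellite.saturn.narvi"
child = "urn:nasa:pds:vg1-saturn-pos-l1coords:data-sedr:sedr-l1::1.0"

# Parent lists target X (titan); the child product lists X and Y (narvi).
args = build_parser().parse_args([])
default_run = context_reference_warnings(
    {titan}, child, {titan, narvi}, args.skip_context_reference_check
)

args = build_parser().parse_args(["--skip-context-reference-check"])
skipped_run = context_reference_warnings(
    {titan}, child, {titan, narvi}, args.skip_context_reference_check
)
```

Without the flag, the parent gets a single warning naming the missing target. With the flag, the same inputs produce no warnings, matching the third acceptance criterion.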