NASA-PDS / registry-api

Web API service for the PDS Registry, providing the implementation of the PDS Search API (https://github.com/nasa-pds/pds-api) for the PDS Registry.
https://nasa-pds.github.io/pds-api

As an API user, I want to know the children and ancestors of bundle, collections, and products #458

Closed: jordanpadams closed this 3 years ago

jordanpadams commented 3 years ago

For more information on how to populate this new feature request, see the PDS Wiki on User Story Development:

https://github.com/NASA-PDS/nasa-pds.github.io/wiki/Issue-Tracking#user-story-development

Motivation

...so that I can "browse" a PDS4 bundle through the API, going up or down the tree through simple REST queries

Additional Details

Dependent upon https://github.com/NASA-PDS/pds-registry-app/issues/76

al-niessner commented 3 years ago

Does the lidvid already describe the tree or is there metadata that defines the tree either through a single linked list (either up or down) or double linked list (up and down)?

al-niessner commented 3 years ago

Can a product or collection belong to more than one bundle? Do we wish to allow this in the future?

jordanpadams commented 3 years ago

@al-niessner

Does the lidvid already describe the tree or is there metadata that defines the tree either through a single linked list (either up or down) or double linked list (up and down)?

not entirely sure what the question is here. all of this metadata is in the registry; we will just need to run various queries to move up and down the tree. @tdddblog can probably help us with how to query the registry for each of the scenarios described in the sub-stories for this story. in some cases (e.g. bundle -> collection), the metadata returned for the bundle LIDVID should contain most of what you need to know which collections belong to that bundle. for others (e.g. collection -> product), there is a separate index in ES (Elasticsearch) for managing this information.

Can a product or collection belong to more than one bundle? Do we wish to allow this in the future?

I tried to highlight this a bit within those stories, but the answer is yes, with a caveat: a product/collection can belong to only 1 bundle as a primary product, but to many bundles as a secondary product. the indexing of secondary products was just implemented by @tdddblog in the PRs I added you to as a reviewer (https://github.com/NASA-PDS/harvest/pull/47 and https://github.com/NASA-PDS/pds-registry-mgr-elastic/pull/21).
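
To make the primary/secondary distinction concrete, here is a minimal in-memory sketch. It is not the registry schema or an actual query: the record layout, the example LIDs, and the function names `build_indexes` and `ancestors` are all hypothetical, and it assumes the "only one primary parent" rule applies at every level of the tree.

```python
from collections import defaultdict

# Hypothetical membership records: (parent LID, child LID, membership kind).
# These LIDs are made up for illustration only.
MEMBERSHIP = [
    ("urn:nasa:pds:bundle_a", "urn:nasa:pds:bundle_a:data", "primary"),
    ("urn:nasa:pds:bundle_b", "urn:nasa:pds:bundle_a:data", "secondary"),
    ("urn:nasa:pds:bundle_a:data", "urn:nasa:pds:bundle_a:data:image_001", "primary"),
]


def build_indexes(records):
    """Build top-down (children) and bottom-up (parents) lookup tables."""
    children = defaultdict(list)
    parents = defaultdict(list)
    for parent, child, kind in records:
        children[parent].append((child, kind))
        parents[child].append((parent, kind))
    # Assumption: a child has at most one primary parent, many secondary ones.
    for child, links in parents.items():
        primaries = [p for p, kind in links if kind == "primary"]
        if len(primaries) > 1:
            raise ValueError(f"{child} has more than one primary parent")
    return dict(children), dict(parents)


def ancestors(lid, parents):
    """Return every ancestor of lid, following primary and secondary links."""
    seen = []
    stack = [lid]
    while stack:
        current = stack.pop()
        for parent, _kind in parents.get(current, []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen


children, parents = build_indexes(MEMBERSHIP)
print(children["urn:nasa:pds:bundle_a"])
print(ancestors("urn:nasa:pds:bundle_a:data:image_001", parents))
```

Keeping both tables means "who are my children?" and "who are my parents?" are each a single lookup, at the cost of storing every link twice.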

al-niessner commented 3 years ago

@jordanpadams

wrt lidvid names and whether they describe the tree: what I was asking was whether the URN is something like urn:domain:bundle:collection:product::version, where the URN is terminated at the type of data, like bundle or collection.

al-niessner commented 3 years ago

@jordanpadams

Still waiting for answer on lidvid question above this one.

Are bundles and collections real items or just virtual ones because they are just part of the namespace? For instance, can I have a bundle or collection with no products and have it change versions (assuming either has versions associated with them)? If I find a bundle without any collections or a collection without any products, what does that tell me other than an empty namespace?

jordanpadams commented 3 years ago

@al-niessner

wrt lidvid names and do they describe the tree. What I was asking was the URN something like urn:domain:bundle:collection:product::version where the URN is terminated at the type of data like bundle or collection.

the definition of the tree (e.g. who are my children?) is pulled from the bundle/collection metadata ingested into the registry. the metadata we extract from the data and ingest into the registry can be thought of as a top-down singly linked list that identifies the children of a particular product. for identifying parents, we could probably do this a couple of ways:

  1. All of the metadata for identifying bundle/collection children is searchable via the registry, so it should be pretty straightforward to modify our query logic to traverse back up the tree.
  2. The LIDVID actually has formation rules where we could figure out a product's parent collection/bundle, which is probably the easiest method. It just hurts me inside when we try to extract information from identifiers. That being said, it is explicitly called out in the PDS4 standard so we might as well take advantage of it. Here is the formation rule:
    urn:nasa:pds:<bundle_id>:<collection_id>:<product_id>::<version_id>
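
As a rough illustration of option 2, the formation rule above can be read with a few lines of Python. This is only a sketch: `Lidvid`, `parse_lidvid`, and `parent_lid` are hypothetical names, not part of registry-api, and the code accepts only the `urn:nasa:pds` prefix shown in the rule. Note that the identifier gives the parent's LID only; it does not say which version of the parent is meant.

```python
from typing import NamedTuple, Optional


class Lidvid(NamedTuple):
    bundle_id: str
    collection_id: Optional[str]
    product_id: Optional[str]
    version_id: Optional[str]


def parse_lidvid(lidvid: str) -> Lidvid:
    """Split urn:nasa:pds:<bundle_id>[:<collection_id>[:<product_id>]][::<version_id>]."""
    lid, _, version = lidvid.partition("::")
    parts = lid.split(":")
    # Assumption: only the urn:nasa:pds prefix from the rule above is accepted.
    if not 4 <= len(parts) <= 6 or parts[:3] != ["urn", "nasa", "pds"]:
        raise ValueError(f"unexpected identifier: {lidvid!r}")
    # Pad the missing levels with None so a bundle or collection LID parses too.
    ids = parts[3:] + [None] * (6 - len(parts))
    return Lidvid(ids[0], ids[1], ids[2], version or None)


def parent_lid(lidvid: str) -> Optional[str]:
    """Return the LID one level up the tree, or None for a bundle."""
    parse_lidvid(lidvid)  # validate the shape first
    parts = lidvid.partition("::")[0].split(":")
    if len(parts) == 4:
        return None  # a bundle is the top of the tree
    return ":".join(parts[:-1])


# The identifiers below are made up for illustration.
print(parse_lidvid("urn:nasa:pds:mybundle:mycollection:myproduct::1.0"))
print(parent_lid("urn:nasa:pds:mybundle:mycollection:myproduct::1.0"))
```

Because this path only recovers the LID-implied parent, it would not reveal bundles that reference a collection as a secondary member; those would still need a registry query.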

@tdddblog could we maybe plan to have a quick demo this afternoon?

Are bundles and collections real items or just virtual ones because they are just part of the namespace?

they are "real" products. everything in PDS4 is technically considered a product. some are more abstract than others, but they all need to be registered and treated as single entities interrelated with other entities within the system.

For instance, can I have a bundle or collection with no products and have it change versions (assuming either has versions associated with them)?

bundles and collections must have 1 or more products associated with them. however, I could imagine a case of an invalid ingestion where something fails, and we query for the products that belong to a collection and get nothing back.

If I find a bundle without any collections or a collection without any products, what does that tell me other than an empty namespace?

bundles must have at least 1 collection; not sure about collections without any products. one additional note to clarify: XML namespaces are not related to bundles/collections/products*. not sure if that is what you meant there.

* one minor caveat here is that the LIDVID formation rules do apply, so you could partly work out the bundle/collection/product hierarchy from the identifier

jordanpadams commented 3 years ago

@al-niessner sorry, i just saw your other question. i think my response above should answer that as well, but we can talk about it some more this afternoon