Closed malloryfreeberg closed 3 years ago
It may be more useful to tag bundles by the process that created them rather than the type of data they contain. The type of data can be inferred from the process and the process is more descriptive.
I have a few more notes about this here:
https://github.com/HumanCellAtlas/dcp-community/pull/86#issuecomment-530276016
@NoopDog what would this tagging look like in practice, and what would a subscription query look like to specifically match such bundles?
It is unclear why there are two RFCs related to bundle types (#86 and #93).
They are not clearly delineated as covering different aspects of bundle types. I believe we would be better off merging the two.
@NoopDog what would this tagging look like in practice, and what would a subscription query look like to specifically match such bundles?
In practice, the least disruptive way would be to add bundle-level metadata fields for:
process_fquid - the process instance that created the bundle
protocol_id - the protocol that the process implements, a value from the protocol core schema
The user can then subscribe to bundles by specifying protocol_id along with other refining metadata.
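To make the subscription idea concrete, here is a minimal sketch of what such a query might look like, assuming subscriptions are expressed as Elasticsearch-style queries. The top-level placement of `protocol_id`, the helper name `build_subscription_query`, and all example values are hypothetical; the actual field path would depend on where the bundle-level metadata ends up being indexed.

```python
import json


def build_subscription_query(protocol_id, refinements=None):
    """Build an Elasticsearch-style bool query matching bundles by protocol.

    ASSUMPTION: `protocol_id` is indexed as a top-level bundle field.
    The real index path is not defined in this discussion.
    """
    # The protocol match is always required.
    must = [{"term": {"protocol_id": protocol_id}}]
    # Optional refining metadata: every extra field must also match.
    for field, value in sorted((refinements or {}).items()):
        must.append({"term": {field: value}})
    return {"query": {"bool": {"must": must}}}


# Hypothetical values, for illustration only.
query = build_subscription_query(
    "example_protocol_id",
    {"example_refining_field": "example_value"},
)
print(json.dumps(query, indent=2))
```

A subscriber would only need to know the protocol identifier and any refining fields, not the opaque process instance ID.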
@NoopDog I appreciate the intent, but I think opaque process IDs would still make the system too complex and indirect to be usable.
Please note: I have merged the contents of #86 into this PR; please refer to the discussion in that PR for additional detail and comments addressed from reviewers.
I created issue #114 to get media types turned into a real RFC. That is good enough to not impact this PR.
@diekhans I already imported it here: https://github.com/HumanCellAtlas/dcp-community/pull/113
As a refinement and extension of this RFC, I have created an alternate RFC, linked here: RFC: HCA DCP Application Layer Bundle Types and Definitions.
The alternate proposal differs from this RFC mainly in that:
Type information is added to a new type.json metadata file rather than added to the DSS bundle.json file.
Type information is expressed in JSON rather than in RFC 7231 media type syntax.
The schema and bundle types are represented as JSON schema in the metadata schema repo and documented on the data portal rather than maintained in a DSS registry.
The proposed types are refined by making the process and protocol that created the bundle explicit in the type.json.
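As a rough illustration of the differences listed above, the sketch below builds a type.json document that names the process and protocol explicitly. Every field name and value here (`bundle_type`, `process`, `protocol`, and the example identifiers) is a hypothetical placeholder; the linked RFC defines the actual layout.

```python
import json


def make_type_document(bundle_type, process_id, protocol_id):
    """Return a dict that could be serialized as a type.json file.

    ASSUMPTION: the key names are illustrative only and are not taken
    from the alternate RFC or from the metadata schema repo.
    """
    return {
        "bundle_type": bundle_type,
        # The process instance that created the bundle.
        "process": {"process_id": process_id},
        # The protocol that the process implements.
        "protocol": {"protocol_id": protocol_id},
    }


# Hypothetical values, for illustration only.
doc = make_type_document(
    "example_bundle_type", "example-process-id", "example_protocol_id"
)
type_json = json.dumps(doc, indent=2, sort_keys=True)
print(type_json)
```

Because the type is plain JSON, it could be validated with a JSON schema like the rest of the metadata rather than parsed as media type syntax.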
Your review/feedback is appreciated.
Pausing to merge with #119 and align with upcoming reproducibility and data citation requirements.
Obsolete
Please note: the contents of #86 were merged into this PR; please refer to the discussion in that PR for additional detail and comments addressed from reviewers.