dandi / dandi-schema

Schemata for DANDI archive project
Apache License 2.0

Helper API to provide metadata item for capturing provenance of a derived data file #190

Open yarikoptic opened 1 year ago

yarikoptic commented 1 year ago

This just came up in the context of spike-sorted data in https://github.com/dandi/dandi-cli/issues/1314, and it would also be used in the @vandermeerlab use case (attn @TheChymera @manimoh). I guess the helper should take a metadata record plus information about the original asset(s), and return an enhanced metadata record. @satra, you mentioned that we had worked that out somewhere.

Then, with that helper in mind, we should add documentation for it to https://github.com/dandi/handbook/.

satra commented 1 year ago

Yes, we worked that out for publishing dandisets. It takes the metadata and adds the provenance: https://github.com/dandi/dandi-archive/blob/6cc74347458f1e12a434ac95ccd417ef48fbe4da/dandiapi/api/services/publish/__init__.py#L90

In the context of dandi-schema, one could use the schema library itself to inject the derivation: set asset_meta.wasDerivedFrom = [other_asset.id] if the field is blank, or append other_asset.id to it if not.

However, right now wasDerivedFrom on assets only supports Biosample derivations (https://github.com/dandi/dandi-schema/blob/fe96019d065fa3a0bb455c3e2bbc07dd4208e4fa/dandischema/models.py#L1391). We should add URI as an option.
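
For illustration only, here is a minimal sketch of what such a helper might look like. It works on plain dicts rather than dandischema models, and it assumes wasDerivedFrom could hold asset identifiers, which the current schema does not yet allow. The function name add_derived_from and the example ids are hypothetical, not part of any existing API.

```python
from copy import deepcopy
from typing import Any, Dict, Iterable


def add_derived_from(
    asset_meta: Dict[str, Any],
    source_assets: Iterable[Dict[str, Any]],
) -> Dict[str, Any]:
    """Return a copy of asset_meta whose wasDerivedFrom lists the source asset ids.

    Hypothetical helper: assumes each source record carries an "id" key and
    that wasDerivedFrom may contain identifiers (not the case in the schema today).
    """
    enhanced = deepcopy(asset_meta)  # leave the caller's record untouched
    # Start a new list if the field is missing or blank, otherwise extend it.
    derived = list(enhanced.get("wasDerivedFrom") or [])
    for source in source_assets:
        source_id = source["id"]
        if source_id not in derived:  # avoid recording the same source twice
            derived.append(source_id)
    enhanced["wasDerivedFrom"] = derived
    return enhanced


# Example with made-up identifiers.
raw_asset = {"id": "dandiasset:raw-example"}
sorted_meta = {"id": "dandiasset:sorted-example"}
enhanced_meta = add_derived_from(sorted_meta, [raw_asset])
```

Returning a copy instead of mutating in place keeps the original record available for comparison; a real implementation would presumably validate the result against the schema afterwards.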