mc2-center / mc2-center-dcc

Data coordination resources for CCKP (and MC2 in general)

Incorporate query for target folder Ids into upload process #30

Closed Bankso closed 7 months ago

Bankso commented 7 months ago

Issue: To upload metadata manifests to Synapse using schematic, a targetId must be provided. The targetId corresponds to the folder where the manifest CSV is stored.

A database containing SynIDs for the Synapse projects and folders for each metadata type was previously generated to support the ingress process.

Moving forward, targetId acquisition should be integrated into the upload process, so that a separate database does not need to be maintained and the risk of uploading to an incorrect target folder is reduced.

Proposed solution: Query a Synapse table to get the target Synapse IDs associated with MC2 grant projects and a given type of metadata.
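As a rough illustration of the lookup, here is a minimal sketch. It assumes a Synapse table with columns named `grantNumber`, `folderName`, and `folderId`; those column names, the placeholder table ID, and both helper functions are hypothetical and not taken from the existing code. The `syn` argument is expected to be a logged-in `synapseclient.Synapse` object, whose `tableQuery(...).asDataFrame()` call returns the query result.

```python
import re

# Hypothetical placeholder; the real table SynID is not given in this issue.
TARGET_TABLE_ID = "syn00000000"

SYNID_PATTERN = re.compile(r"syn\d+")
SAFE_VALUE_PATTERN = re.compile(r"[A-Za-z0-9 _.-]+")


def build_target_query(table_id, grant_number, folder_name):
    """Build the table query for one grant and one metadata folder.

    Column names (grantNumber, folderName, folderId) are assumptions.
    """
    if not SYNID_PATTERN.fullmatch(table_id):
        raise ValueError(f"Not a Synapse ID: {table_id!r}")
    # Reject unexpected characters instead of trying to escape them.
    for value in (grant_number, folder_name):
        if not SAFE_VALUE_PATTERN.fullmatch(value):
            raise ValueError(f"Unexpected characters in {value!r}")
    return (
        f"SELECT folderId FROM {table_id} "
        f"WHERE grantNumber = '{grant_number}' "
        f"AND folderName = '{folder_name}'"
    )


def get_target_id(syn, table_id, grant_number, folder_name):
    """Return the single folder SynID matching the grant and folder name."""
    query = build_target_query(table_id, grant_number, folder_name)
    results = syn.tableQuery(query).asDataFrame()
    target_ids = list(results["folderId"])
    # Fail loudly rather than upload to an ambiguous or missing folder.
    if len(target_ids) != 1:
        raise LookupError(
            f"Expected 1 target folder for {grant_number}/{folder_name}, "
            f"found {len(target_ids)}"
        )
    return target_ids[0]
```

Raising an error when the query returns zero or several rows is a deliberate choice here, since the goal is to limit uploads to incorrect target folders.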

Changes to existing framework: Instead of an input CSV with manifest paths and targetIds as the primary input, I propose that each input CSV row be composed of the following:

Here, the folder name is provided at the command line (e.g., publications, datasets, tools).

Optionally, the input sheet can be generated programmatically, using the grant number contained in the filename of split manifests. This could be done as follows:
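A minimal sketch of one way to build the sheet is shown here. It assumes that grant numbers look like `CA` followed by six digits and that each row holds a manifest path, a grant number, and a folder name; the pattern, the column names, and the function names are all hypothetical and should be adjusted to the actual manifest naming and input format.

```python
import csv
import re
from pathlib import Path

# Assumption: grant numbers appear in filenames as "CA" plus six digits.
GRANT_PATTERN = re.compile(r"CA\d{6}")

# Assumption: hypothetical column names for the input sheet.
FIELDNAMES = ["manifest_path", "grant_number", "folder_name"]


def grant_from_filename(path):
    """Return the grant number found in the filename, or None if absent."""
    match = GRANT_PATTERN.search(Path(path).name)
    return match.group(0) if match else None


def build_input_rows(manifest_paths, folder_name):
    """Create one input row per split manifest."""
    rows = []
    for path in manifest_paths:
        grant = grant_from_filename(path)
        if grant is None:
            raise ValueError(f"No grant number found in {path!r}")
        rows.append(
            {
                "manifest_path": str(path),
                "grant_number": grant,
                "folder_name": folder_name,
            }
        )
    return rows


def write_input_sheet(rows, out_path):
    """Write the rows to a CSV file with a header line."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
```

For example, `build_input_rows(sorted(Path("output").glob("*.csv")), "publications")` would produce rows for every CSV in a hypothetical `output` directory of split manifests, which `write_input_sheet` can then save.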