Incorporate query for target folder Ids into upload process

Issue

To upload metadata manifests to Synapse using schematic, a targetId must be provided, which identifies the folder where the manifest CSV is stored.

A database containing SynIDs for Synapse projects and folders for each metadata type was previously generated (here) to support the ingress process.

Moving forward, targetId acquisition should be integrated into the upload step, so that a separate database does not need to be maintained and the risk of uploading to an incorrect target folder is reduced.

Proposed solution

Query a Synapse table to get the target Synapse IDs associated with MC2 grant projects and a given type of metadata:

The CCKP all files table contains sufficient information to support this step.
Suggested query: SELECT id,name,parentId FROM syn27210848 WHERE name='folder name' AND parentId IN (comma-separated list of single-quoted project Synapse IDs)
Return the query result as a data frame.
Associate targetIds with manifest paths (this is the current input format for the manifest upload script).
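The query steps above could be sketched roughly as follows. This is a minimal sketch, not the upload script's actual code: `syn` is assumed to be a logged-in `synapseclient.Synapse` object, the function names are hypothetical, and the input column names `manifestPath` and `projectId` are assumptions made for illustration.

```python
import re

ALL_FILES_TABLE = "syn27210848"  # table ID taken from the suggested query
SYN_ID = re.compile(r"^syn\d+$")


def build_target_query(folder_name, project_ids):
    """Build the suggested query for folders named folder_name in the given projects."""
    ids = list(project_ids)
    if not ids:
        raise ValueError("at least one project Synapse ID is required")
    bad = [i for i in ids if not SYN_ID.match(i)]
    if bad:
        raise ValueError(f"not Synapse IDs: {bad}")
    if "'" in folder_name:
        # Rejected rather than escaped, to keep the query string simple.
        raise ValueError("folder name must not contain single quotes")
    id_list = ",".join(f"'{i}'" for i in ids)
    return (
        f"SELECT id,name,parentId FROM {ALL_FILES_TABLE} "
        f"WHERE name='{folder_name}' AND parentId IN ({id_list})"
    )


def get_target_ids(syn, folder_name, project_ids):
    """Run the query and return a data frame with id, name and parentId columns."""
    query = build_target_query(folder_name, project_ids)
    return syn.tableQuery(query).asDataFrame()


def add_target_ids(inputs, targets):
    """Attach a targetId column to a data frame of manifestPath/projectId rows.

    `inputs` and `targets` are pandas data frames; the column names on
    `inputs` are assumptions, not part of the existing script.
    """
    targets = targets.rename(columns={"id": "targetId", "parentId": "projectId"})
    return inputs.merge(
        targets[["projectId", "targetId"]],
        on="projectId",
        how="left",
        validate="many_to_one",  # fail if a project has two folders with this name
    )
```

With a left merge, a project that has no folder of the requested name ends up with a missing targetId, which can be checked before any upload is attempted.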

Changes to existing framework

Instead of an input CSV with manifest paths and targetIds as the primary input, I propose that each input CSV row be composed of the following:

manifest paths
project Synapse IDs

The folder name (e.g., publications, datasets, tools) is provided at the command line.
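As an illustration of the proposed input format, the command-line handling might look something like the sketch below. The argument names and the CSV column headers `manifestPath` and `projectId` are hypothetical; the existing upload script may name them differently.

```python
import argparse
import csv

REQUIRED_COLUMNS = {"manifestPath", "projectId"}  # assumed header names


def parse_args(argv=None):
    """Parse the input sheet path and the target folder name."""
    parser = argparse.ArgumentParser(
        description="Upload manifests to target folders looked up by name."
    )
    parser.add_argument("input_csv", help="CSV of manifest paths and project Synapse IDs")
    parser.add_argument("folder_name", help="target folder name, e.g. publications")
    return parser.parse_args(argv)


def read_inputs(lines):
    """Read (manifest path, project Synapse ID) pairs from CSV lines.

    `lines` is any iterable of strings, such as an open file handle.
    """
    reader = csv.DictReader(lines)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"input CSV is missing columns: {sorted(missing)}")
    return [(row["manifestPath"], row["projectId"]) for row in reader]
```

A caller would open the file named by `input_csv` with `newline=""` and pass the handle to `read_inputs`, then look up the target folder for each project using `folder_name`.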

Optionally, the input sheet can be generated programmatically, using the grant number contained in the filename of split manifests. This could be done as follows:

extract grant number from manifest path
query the grantview table to get the project Synapse ID
merge manifest path and project Synapse ID information
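The three steps above could be sketched as follows. Several details here are assumptions, since the issue does not specify them: the grant number pattern (`CA` followed by six digits), the grant view table ID (left as a placeholder), and the column names `grantNumber` and `grantId`. `syn` is again assumed to be a logged-in `synapseclient.Synapse` object, and all function names are hypothetical.

```python
import re
from pathlib import Path

GRANT_VIEW = "syn00000000"  # placeholder: the grant view table ID is not given here
GRANT_NUMBER = re.compile(r"CA\d{6}")  # assumed grant number pattern


def extract_grant_number(manifest_path):
    """Return the grant number found in the manifest filename, or None."""
    match = GRANT_NUMBER.search(Path(manifest_path).name)
    return match.group(0) if match else None


def get_project_ids(syn, grant_numbers):
    """Query the grant view and return a {grant number: project Synapse ID} dict.

    The column names grantNumber and grantId are assumptions.
    """
    id_list = ",".join(f"'{g}'" for g in sorted(set(grant_numbers)))
    query = (
        f"SELECT grantNumber, grantId FROM {GRANT_VIEW} "
        f"WHERE grantNumber IN ({id_list})"
    )
    df = syn.tableQuery(query).asDataFrame()
    return dict(zip(df["grantNumber"], df["grantId"]))


def build_input_rows(manifest_paths, project_by_grant):
    """Pair each manifest path with its project Synapse ID.

    Returns (rows, unmatched): rows are (manifest path, project ID) tuples,
    and unmatched lists paths with no grant number or no known project.
    """
    rows, unmatched = [], []
    for path in manifest_paths:
        project_id = project_by_grant.get(extract_grant_number(path))
        if project_id is None:
            unmatched.append(path)
        else:
            rows.append((path, project_id))
    return rows, unmatched
```

Returning the unmatched paths separately makes it possible to report manifests that could not be assigned a project instead of silently dropping them.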

mc2-center / mc2-center-dcc

Incorporate query for target folder Ids into upload process #30