mc2-center / mc2-center-dcc

Data coordination resources for CCKP (and MC2 in general)

Design and write a curation + ingress workflow script #36

Closed by Bankso 6 months ago

Bankso commented 7 months ago

Discussed with @aditigopalan 1/24/24 - we would like to put together a shell script (or GitHub workflow) that strings our suite of curation and ingress scripts together.

This will help streamline work by introducing a reusable/reproducible pipeline that can be run on a regular basis.
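A minimal sketch of what such a chaining script could look like, assuming each curation or ingress step can be invoked as a single shell command. The `run_pipeline` function and the placeholder steps are hypothetical and stand in for the real scripts in this repository; they are not taken from the actual workflow.

```shell
#!/usr/bin/env bash
# Sketch only: the step commands below are hypothetical placeholders,
# not the repository's actual curation or ingress scripts.
set -euo pipefail

# Run each step in order; stop and report the first step that fails.
run_pipeline() {
  local step
  for step in "$@"; do
    echo "==> running: ${step}"
    if ! bash -c "${step}"; then
      echo "step failed: ${step}" >&2
      return 1
    fi
  done
  echo "pipeline complete"
}

# Placeholder steps; replace each echo with a real curation or ingress command.
STEPS=(
  "echo curation step placeholder"
  "echo validation step placeholder"
  "echo ingress step placeholder"
)

run_pipeline "${STEPS[@]}"
```

Stopping at the first failed step keeps a bad manifest from reaching later ingress steps, which is one reasonable default for a pipeline that is meant to be rerun on a regular basis.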

The PubMed crawler will remain separate from this workflow.

High-level workflow steps (starting with manifests containing metadata from multiple grants) are outlined below, but will need to be expanded on.

Updated 2/26/24 to reflect current progress

Status: shell script written for this part of the process

Status: all scripts are written, but have not been written into a pipeline yet

Status: subset of scripts are written. A workflow for this is lower priority, so we can address it later.

*== script not yet written, step performed manually
== script written, requires updates
^== step has not been implemented manually or programmatically

aditigopalan commented 7 months ago

MC2 Curation and Ingress steps workflow here

Bankso commented 7 months ago

PR #39 adds an initial version of the workflow shell script and an associated README. Let's plan on updating the shell script to reflect any missing steps and (if it makes sense) converting it to a GitHub Actions config.

Bankso commented 6 months ago

Updated description to reflect current status on this. Note that the database sync was implemented via PR https://github.com/mc2-center/mc2-center-dcc/pull/53

aclayton555 commented 6 months ago

This got larger than expected, so there is an opportunity to break it down further and align on what still needs to be implemented. @aditigopalan will structure this work accordingly in tickets for our kickoff later this week.