Closed by aditigopalan 3 weeks ago
24-6 close-out: aiming to upload April, May, and June by the middle of the 24-7/8 sprint. This will likely include a data model release mid-sprint.
mid-sprint:
Publications: pending annotation for tumor type and assay (as discussed in the meeting, mostly because of restricted access, though some are open access)
Datasets: pending annotation for tissue
This ticket tracks curation workflow progression.
Note: Work in the three sections can overlap, so curation workflows for different months may run at the same time.
1. Curation and Annotation
[x] Run PubMed crawler to generate PublicationView manifest [205 publications generated, long sprint anticipated]
[x] Send Amber and Jineta a copy of the PublicationView manifest from latest crawl to review for MC2 Center Newsletter publication highlights
[x] Send Amber "News from CCKP" for MC2 Center Newsletter
[x] Annotate publications in PublicationView manifest [In progress]
[x] Generate ToolView and DatasetView manifests based on PublicationView manifest
[x] Run the automated curation workflow to upload publications, datasets, and tools [This includes splitting manifests, processing and validating manifests, generating target Synapse IDs for upload, schema updates, upload to Synapse, and (in progress) a validation check for uploads]
[x] Generate UNION tables
[x] QC of staging tables
[x] Perform automated portal sync to CCKP
[x] Validate data on the CCKP
Status check [Plan to report numbers for each category following the PubMed crawl]:
[x] Publication upload
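
For reference, the "splitting manifests" and "validating manifests" steps of the automated curation workflow in section 1 could look roughly like the sketch below. This is a minimal illustration, not the actual workflow code: the column names (`grant`, `title`, `assay`), the sample rows, and the helper names `split_manifest` and `missing_required` are all hypothetical placeholders, and the real manifests are assumed to be CSV files with one row per publication.

```python
import csv
import io
from collections import defaultdict


def split_manifest(rows, key):
    """Group manifest rows by the value found in column `key`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def missing_required(rows, required):
    """Return (row_index, column) pairs where a required value is blank."""
    return [
        (i, col)
        for i, row in enumerate(rows)
        for col in required
        if not (row.get(col) or "").strip()
    ]


# Hypothetical manifest content; column names and values are placeholders.
SAMPLE = """grant,title,assay
GRANT-1,Paper A,RNA-seq
GRANT-2,Paper B,
GRANT-1,Paper C,imaging
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# One sub-manifest per grant, so each can be processed and uploaded separately.
by_grant = split_manifest(rows, "grant")

# Rows that still need annotation before upload.
problems = missing_required(rows, ["title", "assay"])
```

In this sketch, blank required fields are reported rather than rejected outright, which fits the pending-annotation state described in the mid-sprint notes; a stricter check could block the upload instead.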
2. Data model [No new updates]
- [ ] Update valid values in the data model and build
- [ ] Generate templates from new model
- [ ] Release new model version [no changes, no release]
- [ ] Update DCA config [no changes, no update]