ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

CPmicroEnvironment update #1293

Open arschat opened 3 weeks ago

arschat commented 3 weeks ago

The project was published in DCP, but the donor metadata is inconsistent between DCP and the publication/GEO (34 vs 12 donors).

Project short name:

CPmicroEnvironment

Primary Wrangler:

Arsenios

Secondary Wrangler:

Associated files

Published study links

Key Events

arschat commented 3 weeks ago

The donor information in the original submission seems to be incorrect.

The number of donors according to Table 1 is 3 + 9 = 12, while the original DCP submission has 34. It seems that each library listed in GEO was given its own donor, whereas each donor actually had 3 different library protocols (scRNA-seq, CITE-seq, TCR-seq), with the exception of donors BL12, Idio4 and BL10, Idio2, who did not have CITE-seq (3 * 12 - 2 = 34 GSM accessions).
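As a quick sanity check of the arithmetic above, here is a minimal sketch that enumerates one library per donor and protocol. The donor labels and the choice of which two donors lack CITE-seq are placeholders for illustration, not the real IDs from Table 1.

```python
# Hypothetical donor labels; the real donor IDs are in the publication's Table 1.
donors = [f"donor_{i}" for i in range(1, 13)]
protocols = ["scRNA-seq", "CITE-seq", "TCR-seq"]

# Assumption: two donors have no CITE-seq library (which two is illustrative only).
missing = {("donor_1", "CITE-seq"), ("donor_2", "CITE-seq")}

# One library (GSM accession) per donor/protocol pair that actually exists.
libraries = [(d, p) for d in donors for p in protocols if (d, p) not in missing]

# Collapsing libraries back to their donor recovers the true donor count.
donors_from_libraries = {d for d, _ in libraries}

print(len(libraries), len(donors_from_libraries))
```

Treating every library as its own donor gives the first number; grouping libraries by donor gives the second.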

It is now modelled with 12 entities each in donor, specimen and cell_suspension.

arschat commented 3 weeks ago

The new submission is now graph valid. We need to decide whether to keep the existing file & protocol uuids or not:

  1. new file & protocol uuids
  2. old file & protocol uuids

arschat commented 3 weeks ago

We will proceed with option 2. @arschat to create a spreadsheet with the updates and uuids, run the first part of the script and, once it looks good, share it with @ESapenaVentura.

arschat commented 2 weeks ago

hca-util-upload-area uuid: 7292c116-0ada-457f-8fa7-833c817674f2

arschat commented 2 weeks ago

I created a spreadsheet using the updated biomaterial metadata but keeping all the existing uuids (biomaterials, processes, protocols, files).

arschat commented 5 days ago

Enrique fixed a typo I made in the cell_suspension uuids. We rolled back the file schema versions to the last ones that use the data: prefix, since the EDAM: prefix does not yet work in prod:

sequence_file: from 10.0.0 to 9.6.0

analysis_file: from 8.0.0 to 7.0.0
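For reference, a rough sketch of how such a version rollback could be applied to a schema URL. It assumes the URL ends in `/<version>/<schema_name>`; the function name and the `ROLLBACK` mapping are made up for illustration and are not part of the script used here.

```python
import re

# Target versions taken from the rollback described above.
ROLLBACK = {"sequence_file": "9.6.0", "analysis_file": "7.0.0"}


def roll_back_described_by(url: str) -> str:
    """Swap the version segment of a schema URL for the rolled-back one.

    Assumption: the URL ends in '/<major>.<minor>.<patch>/<schema_name>'.
    URLs for other schemas, or in another shape, are returned unchanged.
    """
    match = re.search(r"/(\d+\.\d+\.\d+)/(\w+)$", url)
    if match is None or match.group(2) not in ROLLBACK:
        return url
    target = ROLLBACK[match.group(2)]
    return url[:match.start(1)] + target + url[match.end(1):]


print(roll_back_described_by(
    "https://schema.humancellatlas.org/type/file/10.0.0/sequence_file"
))
```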

Fixed some graph invalid errors (links went from specimen to file instead of from cell suspension to file), fixed the project title to match the publication, cleaned the previous submission out of the staging area, exported, and sent the import form.
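The specimen-to-file linking error could be caught with a check along these lines. This is only a sketch: representing links as `(source_type, target_type)` pairs and the function name are assumptions, not how ingest's graph validation is implemented.

```python
# Assumption: each link is a (source entity type, target entity type) pair.
FILE_TYPES = {"sequence_file", "analysis_file"}


def specimen_to_file_links(links):
    """Return links that go straight from a specimen to a file.

    In this dataset files should hang off a cell suspension, so any
    specimen -> file link is reported as suspicious.
    """
    return [
        (src, dst)
        for src, dst in links
        if src == "specimen_from_organism" and dst in FILE_TYPES
    ]


example_links = [
    ("donor_organism", "specimen_from_organism"),
    ("specimen_from_organism", "cell_suspension"),
    ("cell_suspension", "sequence_file"),
    ("specimen_from_organism", "sequence_file"),  # the kind of link that was fixed
]
print(specimen_to_file_links(example_links))
```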