ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

GSE121638 - Mapping the immune environment in clear cell renal carcinoma by single-cell genomics (ImmuneRenalCarcinoma) #305

Open Wkt8 opened 3 years ago

Wkt8 commented 3 years ago

Primary Wrangler: Wei
Secondary Wrangler: Enrique
Associated files:
Google Drive: https://drive.google.com/drive/folders/1gTWohslenXt_hKb7J7PMBUQrKhWHcDEx?usp=sharing
Project: https://contribute.data.humancellatlas.org/projects/detail?uuid=955dfc2c-a8c6-4d04-aa4d-907610545d11

Published study links
Paper: https://www.nature.com/articles/s42003-020-01625-6
Accessioned data: GSE121638

Key Events

Wkt8 commented 3 years ago

Ran the spreadsheet through ingest-graph-validator and uploaded it to staging: https://staging.contribute.data.humancellatlas.org/submissions/detail?id=607edf745ba51701d5bac1a0&project=6c86381e-3e3e-45dc-a51c-138e74130949

The spreadsheet is also in the Google Drive.

Waiting for secondary review. Note to the secondary reviewer: I believe the cell suspensions in GEO have been modelled incorrectly, so I've modelled them following the protocols in the paper.

The key difference is that in GEO there are multiple samples (T cell from renal cancer tissue, CD45+ cells from renal cancer tissue) which I believe are actually libraries. This is because the protocols followed do not have a specific step to sort for T cells, apart from the library preparation protocol (using the 10X VDJ T cell enrichment kit).

ESapenaVentura commented 3 years ago

Hi @Wkt8! I have reviewed the dataset and I have a couple of notes:

Project - Contributors

Project - Funders

Donor

Specimen

Enrichment protocol

Cell suspension

Sequencing protocol

Ontologies - I have added the ontology/ontology label for instrument and sequencing method

Sequence file

Schemas tab

Overall the experiment design LGTM, I think you did a great job modelling VDJ! I have uploaded the updated spreadsheet with my corrections to the folder.

Please check further for possible missing ontologies, I have triple checked the fields but something might have escaped my eyes. This type of missing info won't cause trouble in ingest (since ontology and ontology_label are not required) but may cause problems downstream!
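A minimal sketch of how such a check could be scripted, in case it helps. It assumes the spreadsheet rows have already been loaded as dictionaries and that ontology fields follow a `<field>.text` / `<field>.ontology` / `<field>.ontology_label` column naming pattern; the function name and the example row are hypothetical, not part of ingest or of this dataset.

```python
def find_missing_ontologies(rows):
    """List (row_index, column) pairs where a .text value is filled in
    but the sibling .ontology or .ontology_label column is empty.

    Assumption: columns are named <field>.text, <field>.ontology and
    <field>.ontology_label. Adjust the suffixes if the sheet differs.
    """
    missing = []
    for index, row in enumerate(rows):
        for column, value in row.items():
            # Only look at filled-in free-text columns.
            if not column.endswith(".text") or not str(value or "").strip():
                continue
            base = column[: -len(".text")]
            for suffix in (".ontology", ".ontology_label"):
                sibling = base + suffix
                if not str(row.get(sibling) or "").strip():
                    missing.append((index, sibling))
    return missing


# Hypothetical example row: the instrument has text but no ontology label.
example_rows = [
    {
        "instrument.text": "some sequencer",
        "instrument.ontology": "EFO:0000000",
        "instrument.ontology_label": "",
        "method.text": "",
    }
]
for row_index, column in find_missing_ontologies(example_rows):
    print(f"row {row_index}: {column} is empty")
```

Empty text cells are skipped on purpose, so only fields that were actually filled in get flagged.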

Wkt8 commented 3 years ago

Thanks very much @ESapenaVentura!! Will check further for the ontologies.

ofanobilbao commented 3 years ago

@Wkt8 I moved this to Finished in this board, as it looks done from the DCP perspective. Amend if I did not get it right. Thanks!

ami-day commented 2 years ago

I have pre-converted the MAGE-TAB files and put them here: https://drive.google.com/drive/folders/1n96Q3Ftws3h2ZxmqJr3zpxSCY_VtWTqF
They require checking and manual curation.

I assigned them E-HCAD-53.

The files are missing the 10X TCR samples and data. I am not yet sure if the technology is eligible for SCEA; I need to ask them or find an example on the SCEA portal. Either way, I think the dataset would need to be split by technology type, so new files would need to be generated for the TCR data if it is eligible.

ami-day commented 1 year ago

This has already been started by Wei. SCEA Gitlab branch id: E-HCAD-44
https://gitlab.ebi.ac.uk/ebi-gene-expression/scxa-metadata/-/merge_requests/225

ami-day commented 1 year ago

Made corrections to E-HCAD-44 in Gitlab. Waiting for Silvie's review.

ami-day commented 1 year ago

Handed over to SCEA team (Gitlab) - review required.