ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

Wrangling of new dataset: HumanAdiposeTissue GSE128890 #35

Status: Closed (ami-day closed 3 years ago)

ami-day commented 4 years ago

Dataset/group this task is for:

GSE128890

Wrangler responsible for this dataset/lab:

A wrangler working on operations this sprint

Description of the task:

ami-day commented 4 years ago

I came across an issue identifying whether the single human sample is derived "from the abdominal subcutaneous adipose tissue of a 31 y.o. human female", as stated in the GEO SOFT file, or whether tissue from 8 human donors was pooled, as implicitly suggested in the publication. I will email the contributing author.

ami-day commented 4 years ago

Asked Patrick Searle this: "One more question I had is about cell type selection using FACS. In the methods the following markers are used to enrich for human stromal vascular cells: CD26, ICAM1, CD45, CD31, CD142. Which of these markers, if any, is a positive selection marker? Or are they all used for enrichment of SVCs by negative selection?"

Reply: "For the SVC sequencing, we excluded cells expressing CD45 and CD31 (Lineage-). The other markers were used as positive selection markers for sorting cells in downstream validation/functional analysis (CD26, ICAM1, CD142)."

ami-day commented 4 years ago

Asked Patrick Searle this: "The metadata file downloaded from the GEO database (accession: GSE128890) suggests there is just 1 human abdominal subcutaneous adipose tissue sample, which is derived from a 31 y.o. female (id: GSM3717979). However, in the publication methods section the collection of 8 human adipose samples is described. It would be great to know whether the 8 adipose tissues were pooled prior to single cell RNA sequencing, or whether the sequencing data is available for only 1 of these donors."

Reply: "The sequencing data is only available for 1 donor. We did validation and follow up by other methods in a larger panel."

ami-day commented 4 years ago

The complete metadata sheet based on the publication, GEO metadata and email exchange with Patrick Searle can be found here: https://drive.google.com/drive/folders/1dOTCQ1TAoQm4Pm0VjUM1lxfXryWYL89h

This metadata sheet needs:

ami-day commented 4 years ago

Ami to add remaining ontologies first

mshadbolt commented 4 years ago

Secondary review: Project - contributors

Collection protocol

Specimen from organism

Enrichment protocol

Sequence file

Are we sure that the four sets of I1, R1, R2 from cell suspension SRX5692097 should be analysed separately? That is, do we know that these are cell suspensions that were split into libraries, rather than a single library being split? We just need to confirm this to ensure the process id is grouping the right set of files together, and ideally copy this into the library preparation id column.

I think that for the other two cell suspensions they should all have the same process id, as they are the same cell suspension sequenced on different lanes, but maybe I have misunderstood the structure of the experiment.

ami-day commented 4 years ago

Thanks @mshadbolt, I have made the suggested changes.

For the last point about the process ids: I re-read the methods and they do not specify whether cell suspensions were divided into technical replicates before library preparation, or whether a single library preparation was divided into technical sequencing run replicates. I decided to replace the run accessions in the process ids column with the experiment accessions, as I would think the same experiment accession indicates the runs are technical replicates.
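As a rough illustration of that change, here is a minimal Python sketch of keying the process id on the experiment accession instead of the run accession. Everything in it is an assumption for illustration only: the row layout, the function names, the run IDs, the second experiment ID and the file names are placeholders, not the contents of the actual metadata sheet. Only SRX5692097 is taken from this thread.

```python
from collections import defaultdict

# Placeholder rows. Only SRX5692097 appears in the thread; the run IDs,
# EXPERIMENT_B and all file names are made up for illustration.
SEQUENCE_FILES = [
    {"file_name": "RUN_A_I1.fastq.gz", "run": "RUN_A", "experiment": "SRX5692097"},
    {"file_name": "RUN_A_R1.fastq.gz", "run": "RUN_A", "experiment": "SRX5692097"},
    {"file_name": "RUN_A_R2.fastq.gz", "run": "RUN_A", "experiment": "SRX5692097"},
    {"file_name": "RUN_B_I1.fastq.gz", "run": "RUN_B", "experiment": "SRX5692097"},
    {"file_name": "RUN_B_R1.fastq.gz", "run": "RUN_B", "experiment": "SRX5692097"},
    {"file_name": "RUN_B_R2.fastq.gz", "run": "RUN_B", "experiment": "SRX5692097"},
    {"file_name": "RUN_C_R1.fastq.gz", "run": "RUN_C", "experiment": "EXPERIMENT_B"},
    {"file_name": "RUN_C_R2.fastq.gz", "run": "RUN_C", "experiment": "EXPERIMENT_B"},
]


def process_id_column(rows):
    """Return one process id per row, using the experiment accession.

    Runs that share an experiment accession therefore share a process id,
    i.e. they are treated as technical replicates of one library.
    """
    return [row["experiment"] for row in rows]


def group_files_by_experiment(rows):
    """Group file names by experiment accession to check the grouping by eye."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["experiment"]].append(row["file_name"])
    return dict(groups)


if __name__ == "__main__":
    for experiment, names in group_files_by_experiment(SEQUENCE_FILES).items():
        print(experiment, len(names), "files")
```

With this keying, the two placeholder runs under SRX5692097 end up under a single process id, whereas keying on the run accession would have kept them as two separate groups.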

ESapenaVentura commented 4 years ago

@ami-day this was included in the export so I am guessing we can close the ticket now?

ami-day commented 3 years ago

This project has been exported, so closing this ticket.

ami-day commented 3 years ago

Preparing for SCEA with IDs E-HCAD-20 (human) and E-HCAD-24 (mouse)

mshadbolt commented 3 years ago

I think this ticket should be in finished. It was exported as part of the MVP and is in the browser here: https://data.humancellatlas.org/explore/projects/42d4f8d4-5422-4b78-adae-e7c3c2ef511c?catalog=dcp2