Closed: MightyAx closed this 3 years ago
This project has 2 cell suspensions of immune cells, so it is not very high priority for adding value to the HCA. I have put it into the icebox.
Stalled: need to ask the authors some questions about the dataset
The authors do not have information about the donors, incl. whether they are living EU donors, due to anonymous data/privacy rules. They got back to us with the 10x version. Will submit the metadata and expression matrices (not the raw fastq).
Assigning self as secondary reviewer!
This paper has both mouse and human 10x data. It is oddly sparse on details relating to the human 10x protocols. Please see the note on Analysis_File and the File_Source tab!
Mouse Data: There is mouse 10x data also included in this dataset. Is there a reason we aren't including it together with the human data?
Dissociation Protocol: I might add: After being cleaned with PBS, the tissue was digested in 5 ml full media containing collagenase IV (100 U, Sigma-Aldrich) for 20 min at 37 °C.
Information on Library Preparation and Sequencing Protocol didn't come from the publication (I assume from contributor communication?)
Analysis Protocol: I would add 'Unknown' as the data normalization method as they do not specify it.
Analysis File: (Important!) Pipelines use the file_source field for some indexing. I would add 'GEO' as the File_Source
Additionally, the matrix.tar file only includes Donor 1 and Donor 2 cells, so I would set matrix_cell_count to 4438 cells.
"After quality control, cell cycle filtering and normalization, we analyzed 4438 IL-10-producing T cells pooled from donor 1 and donor 2."
Apart from that it looks great!
Great, thanks @Wkt8. As discussed earlier, I won't include the mouse data for various reasons. About the dissociation protocol: the sequencing data is from the PBMCs only; I think the intestinal tissue that gets dissociated was used for other experiment types. Please let me know if you disagree! It is a bit unclear. I just remember that the "buffy coat" is obtained directly from blood in a tube after it is centrifuged (I have had to do this in a lab, yuck!). I made the other changes you suggested. And yes, I got the library prep and sequencing method info from the authors directly and from GEO :)
@ami-day does this not need to be wrangled to SCEA?
This also needs an update to file_source in order to be indexed and displayed as having matrix files in the Data Portal (other components use file_source as the field to do this).
@ami-day does this not need to be wrangled to SCEA?
I think this is unsuitable for SCEA because only 2 samples were sequenced and they are both IL-10+ CD4+ T-cells from healthy PBMCs. In their guidelines they say they expect at least 3 replicate samples, although some exceptions can be made, e.g. for rare tissue samples or other novel sample types. Because the cells are from healthy PBMCs and blood is a commonly used biomaterial for scRNA-Seq, I think it does not meet their criteria.
This also needs an update to file_source in order to be indexed and displayed as having matrix files in the Data Portal (other components use file_source as the field to do this).
I will try this now! I had added the file source, but when I added it using an allowed value, the project metadata would not validate, which I believe is a problem with validation in ingest. So I removed it in order to validate the project.
@Wkt8 I have made the update and it says it is valid now, so I will re-submit the metadata only.
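For anyone repeating the file_source fix discussed above, here is a rough sketch of the kind of check that could be run on an Analysis_File row before re-submitting. It is only an illustration, not how ingest validation actually works: the record is a plain dict, `check_analysis_file` is a made-up helper, and the allowed-value set is an assumption (only 'GEO' is mentioned in this thread). The field names file_source and matrix_cell_count and the values used are the ones from the comments above.

```python
# ASSUMPTION: illustrative allowed values only; "GEO" is the one value
# named in this thread, the real schema enum may differ.
ALLOWED_FILE_SOURCES = {"GEO", "Contributor"}


def check_analysis_file(record):
    """Return a list of problems found in one analysis-file record (a dict)."""
    problems = []

    # A missing file_source is what kept the matrix from being indexed.
    source = record.get("file_source")
    if not source:
        problems.append("file_source is missing, so the file may not be indexed")
    elif source not in ALLOWED_FILE_SOURCES:
        problems.append(f"file_source {source!r} is not an allowed value")

    # The cell count should be a positive integer when a matrix is attached.
    count = record.get("matrix_cell_count")
    if not isinstance(count, int) or count <= 0:
        problems.append("matrix_cell_count should be a positive integer")

    return problems


# Hypothetical record standing in for one row of the Analysis_File tab.
record = {
    "file_name": "matrix.tar",
    "file_source": "GEO",
    "matrix_cell_count": 4438,
}
print(check_analysis_file(record))
```

An empty list means the row passes these two checks; it says nothing about whether the full project metadata would validate in ingest.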
Dataset/group this task is for: GSE121267
Wrangler responsible for this dataset/lab: Ami
Google sheet: https://docs.google.com/spreadsheets/d/13ABCY3Nzesr0BlCBVBvTtetth9N3Jr9O/edit#gid=877068780
Paper: Molecular and functional heterogeneity of IL-10-producing CD4+ T cells
Description of the task: