ami-day closed 4 years ago
I didn't add a milestone; I guess we can discuss it in our next stand-up tomorrow
Hi @ami-day, I have reviewed the spreadsheet and I have the following comments:
General
Project
Project - Contributors
Project - Publications
Project - Funding source(s)
Specimen from organism
Cell suspension
cell_size field?
Sequence files
SRR7130925.fastq: I think we are only accepting gzipped fastq files
SRR7130925.fastq is a little bit odd, is it only 1 read file?
Dissociation protocol
retail_name field is a string; no need to put double pipes (schema), although I'm not sure how it will be better to separate.
Library preparation protocol
Supplementary file
Happy to go through any doubt you have tomorrow :)
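The two sequence file points above (gzipped fastq only, and a run with a single read file) can be checked before upload with something like the sketch below. It is only an illustration: the `<run>_<read>.fastq.gz` naming pattern and the function name `check_sequence_files` are assumptions made for this example, not part of the ingest tooling.

```python
import re
from collections import defaultdict

# Assumed naming pattern for this sketch: <run accession>[_<read number>].fastq.gz
FASTQ_GZ = re.compile(r"^(?P<run>[A-Za-z0-9]+)(?:_(?P<read>\d+))?\.fastq\.gz$")


def check_sequence_files(file_names):
    """Return (unexpected_names, single_file_runs) for a list of file names.

    unexpected_names: names that do not match the assumed gzipped fastq pattern.
    single_file_runs: run accessions that have exactly one matching file.
    """
    unexpected_names = []
    files_by_run = defaultdict(list)
    for name in file_names:
        match = FASTQ_GZ.match(name)
        if match is None:
            unexpected_names.append(name)
        else:
            files_by_run[match.group("run")].append(name)
    single_file_runs = sorted(
        run for run, files in files_by_run.items() if len(files) == 1
    )
    return unexpected_names, single_file_runs


if __name__ == "__main__":
    names = ["SRR7130925.fastq"]
    unexpected, single = check_sequence_files(names)
    print("Not gzipped fastq:", unexpected)
    print("Runs with a single file:", single)
```

A run listed under single_file_runs is not necessarily wrong (single-end data exists), so the result is a prompt to double-check rather than an error.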
Hi @ESapenaVentura,
I have finished making all the review changes we discussed, and your 'get ontology' script was super helpful.
Would it be possible to do a final review on the updated version (same file name and location)?
@mshadbolt and @zperova: Enrique and I were unsure about the end bias and tag bias options in the 'Library Prep Protocol' tab and the 'Sequencing protocol' tab; it would be great to know your thoughts on this.
The completed metadata sheet is located here: https://drive.google.com/drive/folders/1sA4mDAzvAkCAv8e8LYZPW7qkpT_4pRo8/COMPLETED Humphreys et al - Single-Cell Transcriptomics of a Human Kidney Allograft Biopsy Specimen.xlsx
Thank you
Tested the spreadsheet in staging and there are no validation errors.
A couple of notes, though:
Project short name - Needs to be “computer-readable” (No spaces, no special chars)
21 year-old donor: There are 2 diseases in text but only one in ontology/ontology label. This won't fail validation but will result in a length 2 array with ontology only for the first item. Same with specimen from organism derived from this donor. Example here:
Collection protocols: Looks like both collection protocols are the same but just applied to different donors?
Selected cell types: There are 5 types of cells listed in text but only 1 in ontology/label
Sequence files:
SRR6506830_2.fastq.gz
SRR6506831_1.fastq.gz
SRR6506831_2.fastq.gz
SRR6506832_1.fastq.gz
SRR6506832_2.fastq.gz
SRR6506833_1.fastq.gz
SRR6506833_2.fastq.gz
Should have the same process_id (They all come from tube 4)
Library_prep protocol: Input nucleic acid molecule should be "polyA RNA extract" instead of mRNA. Change the ontology and ontology label as well.
I don't know anything about inDrops, but please check the end bias. Other inDrops projects have been ingested with “3 prime tag” instead of “3 prime end bias”.
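The text/ontology mismatches noted above (2 diseases in text but 1 ontology, 5 cell types in text but 1 ontology) pass validation, so they are easy to miss. A rough sketch of a pre-check is below. It assumes multiple values in a cell are separated by double pipes, and the column names and values in the example are placeholders made up for illustration, not the real spreadsheet headers or ontology IDs.

```python
def split_values(cell, separator="||"):
    """Split a spreadsheet cell into its individual, non-empty values."""
    if cell is None:
        return []
    return [part.strip() for part in str(cell).split(separator) if part.strip()]


def count_mismatches(row, column_groups):
    """Return one {column: value_count} dict per group whose counts differ.

    row: dict mapping column name to cell content.
    column_groups: iterable of column-name tuples that should hold the
    same number of values (e.g. text, ontology, ontology label).
    """
    mismatches = []
    for group in column_groups:
        counts = {column: len(split_values(row.get(column))) for column in group}
        if len(set(counts.values())) > 1:
            mismatches.append(counts)
    return mismatches


if __name__ == "__main__":
    # Placeholder column names and values, for illustration only.
    groups = [("diseases.text", "diseases.ontology", "diseases.ontology_label")]
    row = {
        "diseases.text": "disease A||disease B",
        "diseases.ontology": "EXAMPLE:0000001",
        "diseases.ontology_label": "disease A",
    }
    for counts in count_mismatches(row, groups):
        print("Value counts differ:", counts)
```

Running this over every row of the donor, specimen and cell suspension tabs would flag the cases listed above before the sheet goes to staging.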
Hey @ESapenaVentura, I made the above changes, added ontologies using fill_ontologies.py and re-uploaded the file using the new project short name as the file name.
Could we put this through validation again to ensure I didn't break anything?
Where is the spreadsheet? I have looked everywhere but I am not sure which one is the most up-to-date.
@ESapenaVentura Here it is, I had changed the file name to the project short name: https://drive.google.com/drive/folders/1sA4mDAzvAkCAv8e8LYZPW7qkpT_4pRo8
This is ready to validate and ingest so I am closing the issue now.
Hi, I completed the metadata fields for GEO datasets GSE114156 and GSE109564, which are both associated with the following publication by Humphreys et al.: "Single-Cell Transcriptomics of a Human Kidney Allograft Biopsy Specimen Defines a Diverse Inflammatory Response".
It would be great if this could be reviewed, @mshadbolt @zperova @ESapenaVentura. Here is the file location: https://drive.google.com/drive/folders/118kh4wiHmn4Oz9n1-WZueaxm-8XuCMkA.
There was already a filled-in sheet for GSE109564 in finished projects, so I copied that information over into the combined sheet.
Originally posted by @ami-day in https://github.com/HumanCellAtlas/metadata-schema/issues/1210#issuecomment-578778628