ebi-ait / hca-ebi-wrangler-central

This repo is for tracking work related to wrangling datasets for the HCA, associated tasks and for maintaining related documentation.
https://ebi-ait.github.io/hca-ebi-wrangler-central/
Apache License 2.0

PRJNA379992 - Single cell RNA sequencing to dissect the molecular heterogeneity in lupus nephritis #1063

idazucchi closed 1 year ago

idazucchi commented 1 year ago

Project short name:

Primary Wrangler: @anu-shiva

Secondary Wrangler:

Associated files

Published study links

Key Events

ESapenaVentura commented 1 year ago

Need contributor contact:

idazucchi commented 1 year ago

The dataset is stuck in graph validation - we are waiting on dev fixes

arschat commented 1 year ago

@Wkt8 to help graph validate locally

anu-shiva commented 1 year ago

Graph valid. Ready for secondary review.

Wkt8 commented 1 year ago

Assigning self for secondary review

Wkt8 commented 1 year ago

Donor_Organism: Under the 'disease' field, enter 'normal' for the donor_organisms that are marked as healthy instead of leaving them blank

Would be great if we could incorporate the stage of lupus (II/III/IV/V) into the donor_organism description
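To illustrate the 'disease' suggestion above, here is a minimal sketch of filling the blank values. It assumes the donor_organism tab has been loaded as a list of dicts; the `is_healthy` key is hypothetical and stands in for however the healthy donors are actually marked in the spreadsheet.

```python
def fill_normal_disease(donors):
    """Return copies of the donor rows, setting a blank 'disease' to 'normal'
    for donors flagged as healthy.

    'is_healthy' is a hypothetical key used for illustration only.
    """
    filled = []
    for row in donors:
        row = dict(row)  # copy so the input rows are not modified
        disease = (row.get("disease") or "").strip()
        if row.get("is_healthy") and not disease:
            row["disease"] = "normal"
        filled.append(row)
    return filled
```

Rows that already have a disease value are left as they are, so lupus donors are not affected.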

Collection_Protocol: Not necessary, but I would split the skin collection and renal collection into two separate protocols, and use the ontology term for 'biopsy' (https://ontology.archive.data.humancellatlas.org/ontologies/efo/terms?iri=http%3A%2F%2Fwww.ebi.ac.uk%2Fefo%2FEFO_0009120) for the skin collection protocol and EFO:0009293 (percutaneous kidney biopsy) for the kidney collection protocol.

Specimen_from_organism: There's a specimen for the 'healthy blood' from 'healthy controls', but it isn't linked to a specific donor organism. I saw that there is a BioSamples accession for the sample Metro_PBMC (SAMN06661569), but there's not really any information attached. Nonetheless, we need to be consistent and link every specimen_from_organism to a donor_organism, even if the donor_organism has minimal metadata, so that the downstream systems are able to parse the data.

I would recommend creating a 'donor_31_Metro_PBMC' donor, and linking the Metro_PBMC specimen to it.

Alternatively, remove the blood specimen altogether, as it doesn't seem to be linked to any downstream cell suspension or sequence file.
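The linking rule above can be checked with a short script before graph validation. This is only a sketch: it assumes the specimen rows are available as dicts, and the `specimen_id` and `donor_id` keys are hypothetical names rather than the actual spreadsheet column headers.

```python
def find_unlinked_specimens(specimens, donor_ids):
    """Return the IDs of specimens whose donor link is missing or unknown.

    'specimen_id' and 'donor_id' are hypothetical keys for illustration.
    """
    known = set(donor_ids)
    return [s["specimen_id"] for s in specimens if s.get("donor_id") not in known]


# Example: Metro_PBMC has no donor until 'donor_31_Metro_PBMC' is created and linked.
specimens = [
    {"specimen_id": "Metro_PBMC", "donor_id": None},
    {"specimen_id": "kidney_specimen_1", "donor_id": "donor_1"},
]
unlinked = find_unlinked_specimens(specimens, ["donor_1"])
```

With the rows above, `unlinked` contains only `Metro_PBMC`; adding the recommended donor and linking the specimen to it would leave the list empty.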

Library_preparation: I'm a bit confused about the Fluidigm-C1 based library preparation, but we have a row in the assay cheat sheet (https://docs.google.com/spreadsheets/d/1H9i1BK-VOXtMgGVv8LJZZZ9rbTG4XCQTBRxErdqMvWk/edit#gid=0) which states that it has 'full length' end bias and only generates 1 fastq per cell, which matches the sequence_file tab. Maybe change the end bias to match that? Also, I'm not sure where you got the information for the cell barcode and UMI barcode lengths. Please share, as I couldn't find it!