HumanCellAtlas / data-operations

Project Completion QA Checklist: TissueStability #5

Open jychien opened 4 years ago

jychien commented 4 years ago

Project UUID: c4077b3c-5c98-4d26-a614-246d12c2e5d7
Project Title: Ischaemic sensitivity of human tissue by single cell RNA seq
Project Short Name: TissueStability
Submission UUID: fd52efcc-6924-4c8a-b68c-a299aea1d80f
Environment: Production

This project is also known as the "Meyer dataset" or "TissueSensitivity" and was re-ingested to incorporate additional data from the contributor.

jychien commented 4 years ago

Process QA found the following issue:

  1. After downloading the mtx output for the project, barcodes.tsv was found to contain duplicate barcodes. The Matrix Service is currently implementing a change to use cell ID rather than barcode for the file: HumanCellAtlas/matrix-service/issues/428

  2. This project contains bulk RNA-seq and WGS data. For these assay types there are no cell suspensions, since the specimen does not undergo a dissociation protocol, so sequence files are linked directly to the specimen. This experimental graph may be an issue for downstream components and will need to be followed up on before a pipeline is made available in production. (Related slack thread)
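
For reference, the duplicate check in item 1 can be sketched roughly as below. This is only an illustration, assuming barcodes.tsv holds one barcode per line; the function name `find_duplicate_barcodes` is hypothetical and not part of the Matrix Service.

```python
from collections import Counter
from typing import Dict, Iterable


def find_duplicate_barcodes(lines: Iterable[str]) -> Dict[str, int]:
    """Return each barcode that occurs more than once, with its count.

    Assumes one barcode per line (hypothetical layout for barcodes.tsv);
    blank lines are ignored.
    """
    counts = Counter(line.strip() for line in lines if line.strip())
    return {barcode: n for barcode, n in counts.items() if n > 1}


# Example with in-memory lines; for a real file, pass an open file handle:
#     with open("barcodes.tsv") as handle:
#         duplicates = find_duplicate_barcodes(handle)
example = ["AAACCTGA\n", "AAACGGGT\n", "AAACCTGA\n"]
duplicates = find_duplicate_barcodes(example)
```

An empty result means every barcode in the file is unique.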

jahilton commented 4 years ago

validator results...

  - cell_suspension.cell_morphology.cell_viability_method:Trypan blue; manual haemacytometer should be reviewed for consistency across projects
  - library_preparation_protocol.input_nucleic_acid_molecule.ontology:OBI:0000869 is polyA RNA extract but input as polyA RNA
  - library_preparation_protocol.library_construction_method.ontology_label:10X v2 sequencing needs the more specific 3' or 5' term
  - library_preparation_protocol.library_construction_method.ontology_label:DNA library construction & cDNA library construction don't feel parallel to the other values in this field
  - project.insdc_project_accessions:ERP114453 does not match pattern
  - project.insdc_study_accessions:PRJEB31843 does not match pattern
  - sequence_file.file_core.format:fastq.gz not in ['fastq']
  - sequencing_protocol.10x.pooled_channels:4.0 does not match pattern (expecting integer)

The polyA RNA issue is across-the-board, so it should be decided on for all projects. All others have been noted in https://github.com/HumanCellAtlas/hca-data-wrangling/issues/355
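
To illustrate the last two validator messages (the `fastq.gz` enum failure and the `4.0` integer failure), here is a rough sketch of the kind of check involved. It is not the validator's actual code: the helper names are hypothetical, and the only value taken from the report is the allowed list `['fastq']`.

```python
from typing import Any, List

# Allowed values as quoted in the validator message above.
ALLOWED_FORMATS: List[str] = ["fastq"]


def check_enum(value: str, allowed: List[str]) -> bool:
    """True if value is one of the allowed enum values (exact match)."""
    return value in allowed


def as_integer(value: Any) -> int:
    """Return value as an int if it is integral (e.g. 4.0 or "4.0" -> 4).

    Raises ValueError for non-integral or non-numeric input.
    """
    if isinstance(value, bool):
        raise ValueError("booleans are not accepted as integers")
    number = float(value)  # raises ValueError for non-numeric strings
    if not number.is_integer():
        raise ValueError(f"{value!r} is not an integer")
    return int(number)


format_ok = check_enum("fastq.gz", ALLOWED_FORMATS)  # exact match fails
pooled_channels = as_integer("4.0")                  # integral, so accepted
```

Whether `4.0` should be normalized to `4` or rejected outright is a wrangling decision; the sketch only shows that the value itself is integral.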

jychien commented 4 years ago

The paper studies the effects of cold ischaemic time on scRNA-seq analysis. There should be a timecourse module attached to the biomaterial.