hagenaue / CellTypeAnalyses_Darmanis

This repository contains the full code for the cell type analyses applied to the Darmanis et al. single-cell RNAseq data in the manuscript "INFERENCE OF CELL-TYPE COMPOSITION FROM HUMAN BRAIN TRANSCRIPTOMIC DATASETS ILLUMINATES THE EFFECTS OF AGE, MANNER OF DEATH, DISSECTION, AND PSYCHIATRIC DIAGNOSIS" (http://biorxiv.org/content/early/2016/11/25/089391) by Megan Hastings Hagenauer, Ph.D.1, Jun Z. Li, Ph.D.2, David M. Walsh, Psy.D.3, Marquis P. Vawter, Ph.D.3, Robert C. Thompson, Ph.D.1, Cortney A. Turner, Ph.D.1, William E. Bunney, M.D.3, Richard M. Myers, Ph.D.4, Jack D. Barchas, M.D.5, Alan F. Schatzberg, M.D.6, Stanley J. Watson, M.D., Ph.D.1, Huda Akil, Ph.D.1

1Mol. Behavioral Neurosci. Inst., Univ. of Michigan, Ann Arbor, MI, USA; 2Genet., Univ. of Michigan, Ann Arbor, MI, USA; 3Univ. of California, Irvine, CA; 4HudsonAlpha Inst. for Biotech., Huntsville, AL, USA; 5Stanford, Palo Alto, CA; 6Cornell, New York, NY, USA

Availability of CellTypeSpecificGenes_Master3.csv #1

Open ishaanpbs opened 7 years ago

ishaanpbs commented 7 years ago

Could you please provide the CellTypeSpecificGenes_Master3.csv file used in the 03_CalculatingCellTypeIndices.R file?

hagenaue commented 7 years ago

Certainly! It can be found on my website: https://sites.google.com/a/umich.edu/megan-hastings-hagenauer/home/cell-type-analysis It is also included in the BrainInABlender package on my github site. Would you like the original Darmanis data too?
Perhaps I should make a data repository to accompany the code for each analysis.

ishaanpbs commented 7 years ago

Thanks a lot for the quick reply. A data repository would be great; that way people like me would have no issues reproducing the analysis from your paper and could apply the method further with confidence. Thanks again.


hagenaue commented 7 years ago

Excellent. Will do. Thank you for your feedback! I would love to have someone double check my work.

hagenaue commented 7 years ago

Sorry for the delay - I'm a github novice, so I had to double check on file upload limits, command line code, etc.
Quick question: The code for analyzing the Darmanis et al. dataset begins with a sample info worksheet and the original .csv files for each sample publicly available on NCBI's GEO ( https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE67835).
Would you prefer I upload the concatenated version?
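For readers wondering what "the concatenated version" would involve, here is a minimal Python sketch of merging per-sample files into one genes-by-samples table. It is not the repository's R code. The two-column `gene<TAB>count` layout, the sample IDs, and the counts are all assumptions made for illustration; check the actual GEO files before relying on any of them.

```python
import csv
import io


def read_sample(handle, delimiter="\t"):
    """Read one per-sample file of `gene<delim>count` rows into a dict.

    The two-column layout is an assumption, not a documented format.
    """
    counts = {}
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        counts[row[0].strip()] = float(row[1])
    return counts


def concatenate_samples(samples):
    """Merge {sample_id: {gene: count}} into a genes x samples table.

    Genes missing from a sample are filled with 0.0.
    """
    sample_ids = sorted(samples)
    genes = sorted(set().union(*(samples[s].keys() for s in sample_ids)))
    table = [[samples[s].get(g, 0.0) for s in sample_ids] for g in genes]
    return genes, sample_ids, table


# Toy stand-ins for two per-sample files (hypothetical IDs and counts).
toy_files = {
    "sample_A": "GFAP\t10\nMBP\t0\n",
    "sample_B": "GFAP\t1\nMBP\t25\n",
}
samples = {sid: read_sample(io.StringIO(text)) for sid, text in toy_files.items()}
genes, sample_ids, table = concatenate_samples(samples)
```

With real data, each `io.StringIO(text)` would be replaced by an opened file handle, and the sample info worksheet would supply the mapping from file name to sample ID.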

ishaanpbs commented 7 years ago

Having all the files in the same directory, concatenated or otherwise, works. I have a question regarding the text: in your supplementary material you say that the cell type expression is evident from TRAP-seq data. This is an important point, as everyone in the field is moving to sorting nuclei based on TRAP-seq results and feels confident that they represent true signal. It would be great if you could elaborate on this, for example which genes are deviating.


hagenaue commented 7 years ago

Let me make sure I'm answering your question correctly. Are you referring to the sentence on page 3 of "S1 Text.docx": "Occasionally, we found that the cell type indices associated with cell type specific gene lists derived from TRAP methodology (15) did not properly predict the cell identity of the samples"?

If so, you can see an illustration of the issue in Suppl Figures 1-2 in "S2 Text.docx" - the cell type predictions based on the TRAP data include "Doyle" in their label (they are derived from the Doyle et al. 2008 publication). A good example of a cell type prediction that doesn't match cell identity is the Oligodendrocyte_All_Doyle_Cell_2008 based cell type index. The expression of these genes isn't elevated in cells previously identified as oligodendrocytes, whereas the expression of oligodendrocyte-related genes identified by a variety of other publications is quite elevated (Cahoy et al., Zeisel et al., Darmanis et al., Zhang et al. - although note that the last two are self-referential in two of these datasets). Within the main text, the correlation matrix in Figure 3 also illustrates how the predictions based on the cell type specific genes identified by Doyle et al. often don't match those derived from cell type specific genes identified by other publications.
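The comparison described above can be sketched in Python. This is an illustrative sketch only, not the code in 03_CalculatingCellTypeIndices.R: it assumes a cell type index is the per-cell mean of z-scored expression over a gene list, and the expression values, cell labels, and the `GENE_X`/`GENE_Y` names are hypothetical toy data.

```python
from statistics import mean, pstdev


def zscore(values):
    """Z-score one gene across cells; a constant gene maps to all zeros."""
    mu, sd = mean(values), pstdev(values)
    return [0.0 if sd == 0 else (v - mu) / sd for v in values]


def cell_type_index(expression, gene_list):
    """Per-cell mean z-score over the listed genes found in `expression`.

    expression: {gene: [value per cell]} (assumed layout).
    """
    z = [zscore(expression[g]) for g in gene_list if g in expression]
    if not z:
        raise ValueError("none of the listed genes are in the expression table")
    return [mean(per_cell) for per_cell in zip(*z)]


def mean_index_by_label(index, labels):
    """Average the index within each previously assigned cell identity."""
    groups = {}
    for value, label in zip(index, labels):
        groups.setdefault(label, []).append(value)
    return {label: mean(vals) for label, vals in groups.items()}


# Toy data: four cells with previously assigned identities.
labels = ["oligodendrocyte", "oligodendrocyte", "neuron", "neuron"]
expression = {
    "MBP": [9, 11, 1, 1],
    "PLP1": [8, 12, 0, 2],
    "GENE_X": [5, 6, 5, 6],  # hypothetical gene unrelated to identity
    "GENE_Y": [3, 2, 3, 2],  # hypothetical gene unrelated to identity
}

tracking = mean_index_by_label(cell_type_index(expression, ["MBP", "PLP1"]), labels)
flat = mean_index_by_label(cell_type_index(expression, ["GENE_X", "GENE_Y"]), labels)
```

In this toy case the first gene list yields an index that is higher in the cells labelled as oligodendrocytes, while the second yields an index that does not separate the groups, which is the kind of mismatch described for the Doyle-derived list.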

I was a little hesitant to draw too much attention to it, but recently I was talking with Ogan Mancarci and Lilah Toker who dug deeper into a multitude of datasets characterizing cell type specific expression to create an interactive cleaned-up database Neuroexpresso (http://neuroexpresso.org, https://github.com/oganm/neuroexpresso, http://biorxiv.org/content/early/2016/11/24/089219), and they mentioned that they similarly ended up not including several of the TRAP datasets from Doyle et al. 2008 in their final analysis because they found evidence that they were contaminated with other cell types (not truly purified).