dviraran / SingleR

SingleR: Single-cell RNA-seq cell types Recognition (legacy version)
GNU General Public License v3.0

Getting Started #72

Open WhataShane opened 5 years ago

WhataShane commented 5 years ago

Hello!

Thank you so much for working on this library. I'm a high school student interning at a local lab. I was having trouble adapting the tutorial and wanted to ask whether you could offer any advice. My own background is principally in computer science, and less so in immunology.

I have two .tsv data files related to RNA-seq, and I'm trying to annotate clusters in a t-SNE plot I generated using Seurat.

One, barcodes.tsv (100kb), looks as follows: AAACCCAAGAGAGGTA-1 AAACCCAAGAGATTTA-1 etc...

The other, genes.tsv (700kb), looks like this:
ENSMUSG00000025950 Idh1 ESMUSGG00000025950 Ign3 etc...

There's a matrix.mtx file, filled with numbers formatted like so:

27921 1 8

I tried initializing a SingleR object with the following:

singler = CreateSinglerSeuratObject("/directory/housing/tsvFiles", project.name = "analysis", min.genes = 500, technology, species = "Mouse", normalize.gene.length = F, min.cells = 2, npca = 10, regress.out = "nUMI", reduce.seurat.object = T)

However, the estimated time remaining is currently 3 days and 4 hours. I'm running this on a 2015 MacBook Air. First off, did I run the CreateSinglerSeuratObject command properly, and do I need any additional files/metadata to use SingleR? If everything's good, are my files just too large to process in one go?

Thanks so much in advance!

dviraran commented 5 years ago

Hi,

Thank you for your kind words.

First, SingleR is a tool to assist in annotating the cell types from a single-cell RNA-seq experiment. It is not a tool for any other scRNA-seq analyses. For getting started with scRNA-seq, I strongly suggest getting to know Seurat or alternative packages (scater, monocle, etc.). Seurat has very nice 'getting started' tutorials.

The function you are using, 'CreateSinglerSeuratObject', simply automates the Seurat pipeline before using SingleR to annotate the cells.

According to your description, your data comes from a cellranger analysis and the data is probably acquired with the 10X platform. Seurat has a function called Read10X that reads those three files and returns a matrix.
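As a rough illustration of what those three files hold, the sketch below builds a tiny toy directory (invented gene names, barcodes, and counts, not data from this thread) and reads it back by hand with the Matrix package. In the Matrix Market coordinate format, each data line is a row index, a column index, and a value, so a line such as 27921 1 8 would mean gene row 27921, cell column 1, count 8. The final Read10X call is guarded so it only runs if Seurat is installed, and the exact form of the returned column names may differ between Seurat versions.

```r
library(Matrix)  # sparse matrix support; ships with standard R installations

# Toy stand-in for a cellranger output directory (hypothetical data).
toy_dir <- file.path(tempdir(), "toy_10x")
dir.create(toy_dir, showWarnings = FALSE)

# 3 genes x 2 cells, stored sparsely as (row, column, value) triplets.
counts <- sparseMatrix(
  i = c(1, 3, 2),   # gene (row) indices
  j = c(1, 1, 2),   # cell (column) indices
  x = c(8, 2, 5),   # counts
  dims = c(3, 2)
)
writeMM(counts, file.path(toy_dir, "matrix.mtx"))
writeLines(c("AAAC-1", "AAAG-1"), file.path(toy_dir, "barcodes.tsv"))
writeLines(
  c("GENE_ID_1\tGeneA", "GENE_ID_2\tGeneB", "GENE_ID_3\tGeneC"),
  file.path(toy_dir, "genes.tsv")
)

# Manual read: the .mtx file gives the numbers, the .tsv files give the labels.
mat <- readMM(file.path(toy_dir, "matrix.mtx"))
genes <- read.delim(file.path(toy_dir, "genes.tsv"),
                    header = FALSE, stringsAsFactors = FALSE)
barcodes <- readLines(file.path(toy_dir, "barcodes.tsv"))
dimnames(mat) <- list(genes[[2]], barcodes)  # rows = gene symbols, columns = cells
print(dim(mat))

# Seurat's Read10X performs the same assembly in one call (only if installed).
if (requireNamespace("Seurat", quietly = TRUE)) {
  counts_10x <- Seurat::Read10X(data.dir = toy_dir)
  print(dim(counts_10x))
}
```

For real data, the same Read10X call would point at the directory holding barcodes.tsv, genes.tsv, and matrix.mtx, and the resulting matrix can then be used to build a Seurat object.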

Finally, regarding your question: yes, it seems you have quite a big dataset. I provide a function, 'CreateBigSingleRObject', that you can try; see this tutorial.

Good luck with your analysis.

Best, Dvir

WhataShane commented 5 years ago

Thank you so much for getting back to me and for the well wishes. I have one follow-up question: if I provide SingleR with the path to the directory housing my 10X data files, will SingleR be able to parse them? Or do I first need to read them with Seurat's Read10X? I really appreciate the help! I'm going to check out the tutorials you linked now. Thanks again!

Best, Shane

dviraran commented 5 years ago

Yes, you can use the SingleR functions with the path to the 10X directory, but as you said, the data is too big. It's definitely better to first create a Seurat object and use the data in the object as input (see case 2 in the 'Create object' vignette or the CreateBigSingleRObject tutorial).

Best, Dvir

WhataShane commented 5 years ago

Awesome, I ran everything successfully. Thank you once more for all the help!

Best, Shane