10XGenomics / cellranger

10x Genomics Single Cell Analysis
https://www.10xgenomics.com/support/software/cell-ranger

Reproducing filtered cells with emptyDrops #112

Closed HuynhNPT closed 3 years ago

HuynhNPT commented 3 years ago

Hi,

I have been trying to reproduce the filtered_feature_bc_matrix result with DropletUtils::emptyDrops, starting from the raw_feature_bc_matrix. However, I consistently get more cells. What accounts for the difference, and is there a setting I could change to match Cell Ranger's output?

This is my code:

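library(knitr)
library(DropletUtils)
library(dplyr)

# Load the raw (unfiltered) matrix; "path/" is the raw_feature_bc_matrix directory.
sce <- read10xCounts("path/")
counts <- counts(sce)

# Test every barcode against the ambient RNA profile.
e.out <- emptyDrops(counts)

# Barcodes at or below the `lower` UMI threshold get FDR = NA; treat them as non-cells.
is.cell <- e.out$FDR <= 0.001
is.cell.no.na <- is.cell
is.cell.no.na[is.na(is.cell)] <- FALSE

# Barcodes called as cells.
ce <- colData(sce)$Barcode[is.cell.no.na]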
Compared to the cells from cellranger count, we picked up ~60% more cells (2800 vs 1600).
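One way to narrow down a discrepancy like this is to check which barcodes each method calls, rather than only comparing totals. The sketch below uses toy barcode vectors (hypothetical values, not from this dataset) and a hypothetical helper, compare_calls, built only on base R set operations:

```r
# Toy barcode vectors standing in for the two cell calls (hypothetical values).
emptydrops_barcodes <- c("AAAC-1", "AAAG-1", "AACT-1", "AAGG-1")
cellranger_barcodes <- c("AAAC-1", "AAAG-1", "ACCC-1")

# Hypothetical helper: split two barcode sets into shared and method-specific calls.
compare_calls <- function(a, b) {
  list(
    shared = intersect(a, b),  # called by both methods
    only_a = setdiff(a, b),    # called only by the first method
    only_b = setdiff(b, a)     # called only by the second method
  )
}

cmp <- compare_calls(emptydrops_barcodes, cellranger_barcodes)
lengths(cmp)  # number of barcodes in each group
```

With real data, the first argument would be `ce` from the code above and the second would be the Barcode column from read10xCounts() pointed at the filtered_feature_bc_matrix directory. Looking at e.out$Total for the barcodes in only_a may then show whether the extra calls are mostly low-UMI barcodes.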

evolvedmicrobe commented 3 years ago

The EmptyDrops algorithm has changed somewhat since it was first ported into Cell Ranger, and the results of the two implementations have since diverged slightly. You can step through the R and Python code (https://github.com/10XGenomics/cellranger/blob/master/lib/python/cellranger/cell_calling.py) to pinpoint the exact differences you're observing.
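As a possible shortcut before stepping through both code bases: newer DropletUtils releases appear to include emptyDropsCellRanger(), documented as an approximation of Cell Ranger's variant of the algorithm. The sketch below assumes that function is available in the installed version and that its output has an FDR column like emptyDrops(). The wrapper name call_cells_cr_like and the 0.01 cutoff are illustrative assumptions, not values confirmed in this thread.

```r
# Sketch only: assumes a DropletUtils version that provides emptyDropsCellRanger().
# call_cells_cr_like is a hypothetical wrapper; fdr = 0.01 is an assumed cutoff.
call_cells_cr_like <- function(raw_dir, fdr = 0.01) {
  # Load the raw_feature_bc_matrix directory.
  sce <- DropletUtils::read10xCounts(raw_dir)
  # Run the Cell Ranger-like test with its default settings.
  e.out <- DropletUtils::emptyDropsCellRanger(SingleCellExperiment::counts(sce))
  # which() drops barcodes whose FDR is NA (those that were never tested).
  keep <- which(e.out$FDR <= fdr)
  SummarizedExperiment::colData(sce)$Barcode[keep]
}
```

Calling call_cells_cr_like("path/") on the same raw matrix and comparing the result with the filtered_feature_bc_matrix barcodes should indicate how much of the gap comes from algorithm settings as opposed to later changes in Cell Ranger itself.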