constantAmateur / SoupX

R package to quantify and remove cell free mRNAs from droplet based scRNA-seq data

decontaminating pancreatic dataset #23

Closed jmzvillarreal closed 4 years ago

jmzvillarreal commented 4 years ago

Hi, I am using SoupX to decontaminate a dataset of pancreatic cells in which acinar enzymes are contaminating non-acinar cells. I have used soup-specific genes to determine the contamination fraction and correct the expression profile as follows:

WT_36Dir <- c("/local/ljmartinezv/sc_pancreas_M_Serrano/Final_analysis/WT/AL4936/")
WT_36_CellID <- read.table('WT_36_CELLS', header = FALSE, sep = '\t')
WT_36 <- load10X(dataDir = WT_36Dir, cellIDs = WT_36_CellID$V1, keepDroplets = TRUE)
WT_36 <- estimateSoup(WT_36)

Soup specific genes

Soup_genes_36 <- head(WT_36$soupProfile[order(WT_36$soupProfile$est, decreasing = TRUE), ], n = 50)
Soup_genes_36 <- rownames(Soup_genes_36)

Estimating non-expressing cells

useToEst_36 = estimateNonExpressingCells(WT_36, nonExpressedGeneList = list(Soup_genes_36))

Calculating the contamination fraction

WT_36 <- calculateContaminationFraction(WT_36, list(Soup_genes_36), useToEst = useToEst_36)

estimated global contamination fraction of 37.60%

Correcting expression profile

WT_36_decont <- adjustCounts(WT_36)

DropletUtils::write10xCounts("./WT_36Counts", WT_36_decont)

Does that look fine to you? Thanks in advance, Jaime.

constantAmateur commented 4 years ago

It looks like you're just using the 50 genes most highly expressed in the soup to determine the contamination fraction. You should not do this: these genes are likely to be genuinely expressed in many of your cells, so treating their counts as contamination will over-estimate the contamination fraction (potentially by a lot; I doubt your contamination is really as high as 37%). The correct thing to do is to pick genes that you know should not be expressed in a particular set of cells. I'm not an expert in your context, but something like insulin in acinar cells, which you know shouldn't be there. See the vignette for an example.
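
A rough sketch of what that could look like, reusing the `WT_36` object from the question. The gene set is an assumption for illustration only: it supposes mouse data with gene symbols as row names and uses the insulin genes `Ins1` and `Ins2` as genes that acinar cells should not express. The list name `INS` is arbitrary, and the choice has not been checked against this dataset.

```r
library(SoupX)

# Assumption: WT_36 is the SoupChannel built with load10X() and estimateSoup() in the question.
# Hypothetical marker set: genes known to be absent from a subset of cells
# (here insulin genes, which acinar cells are not expected to express).
nonExpressedGeneList <- list(INS = c("Ins1", "Ins2"))

# Keep only the genes that are actually present in the count matrix.
nonExpressedGeneList <- lapply(nonExpressedGeneList, intersect, rownames(WT_36$toc))

# Find the cells in which these genes can be assumed to be unexpressed,
# so that any counts seen there are attributed to the soup.
useToEst_36 <- estimateNonExpressingCells(WT_36, nonExpressedGeneList = nonExpressedGeneList)

# Estimate the contamination fraction from those cells only.
WT_36 <- calculateContaminationFraction(WT_36, nonExpressedGeneList, useToEst = useToEst_36)

# Correct the expression profile with the new estimate.
WT_36_decont <- adjustCounts(WT_36)
```

Supplying cluster assignments, if you have them, should make the choice of non-expressing cells more reliable than deciding cell by cell.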

Failing that, you're better off setting the contamination fraction to something reasonable (like 10%) and proceeding with that.
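
A minimal sketch of the manual fallback, assuming the installed SoupX version provides `setContaminationFraction` and reusing the `WT_36` object from the question. The value 0.10 is only the placeholder mentioned above, not an estimate for this dataset.

```r
library(SoupX)

# Assumption: WT_36 is the SoupChannel from the question, with the soup profile estimated.
# Set the global contamination fraction by hand instead of estimating it.
WT_36 <- setContaminationFraction(WT_36, 0.10)

# Correct the expression profile using the manually set fraction.
WT_36_decont <- adjustCounts(WT_36)
```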