constantAmateur / SoupX

R package to quantify and remove cell free mRNAs from droplet based scRNA-seq data

Why do I get contamination fraction >100% #39

Closed chansigit closed 4 years ago

chansigit commented 4 years ago

I used the endothelial marker VWF to estimate the contamination fraction for a group of immune cells.

After running sc <- calculateContaminationFraction(sc=sc, nonExpressedGeneList=nonExpGlist, useToEst = useToEst_clustering)

I got the message: Estimated global contamination fraction of 114.78%

How is that possible? Why does the algorithm return a value greater than 100%?

constantAmateur commented 4 years ago

This indicates that the gene you are using to estimate the contamination is genuinely expressed in cells where SoupX has been told (via the useToEst parameter) that it is not expressed. Did you construct useToEst_clustering manually, or was it produced by estimateNonExpressingCells?
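
To make the arithmetic behind this concrete, here is a rough sketch in plain R. It does not call SoupX. It assumes the estimate behaves roughly like the ratio of observed counts to the counts expected if each cell were pure soup. All numbers and variable names below are made up for illustration.

```r
# Toy illustration only; hypothetical numbers, not SoupX's actual code.
# Assumption: the contamination estimate behaves roughly like
#   (observed counts of the gene) / (counts expected if the cell were 100% soup).

soup_frac_vwf <- 1e-3                        # assumed fraction of soup UMIs that are VWF
n_umis        <- c(5000, 8000, 6000, 7000)   # total UMIs in each cell flagged in useToEst
expected_soup <- soup_frac_vwf * n_umis      # VWF counts per cell if it were pure soup

# Case 1: VWF is truly absent from these cells, so the few observed
# counts can only have come from the soup.
obs_absent <- c(0, 1, 0, 1)
rho_absent <- sum(obs_absent) / sum(expected_soup)        # 2 / 26, about 7.7%

# Case 2: the cells genuinely express VWF, so observed counts exceed
# what even a pure-soup droplet would contain.
obs_expressed <- c(6, 9, 7, 8)
rho_expressed <- sum(obs_expressed) / sum(expected_soup)  # 30 / 26, about 115%

cat(sprintf("VWF absent:    %.1f%%\n", 100 * rho_absent))
cat(sprintf("VWF expressed: %.1f%%\n", 100 * rho_expressed))
```

In the second case the ratio goes above 1 because the observed counts include real expression, which is why an estimate over 100% points to a problem with the cells marked in useToEst rather than with the soup itself.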