constantAmateur / SoupX

R package to quantify and remove cell free mRNAs from droplet based scRNA-seq data

How does SoupX remove ambient RNA expression when two or more gene sets are used? #33

Closed YiweiNiu closed 4 years ago

YiweiNiu commented 4 years ago

Hi,

Thank you for this nice package!

The tutorial gives an example using IG genes as nonExpressedGeneList. I wonder what SoupX would do if two or more gene sets are fed into nonExpressedGeneList?

Say cell 1 is a red blood cell, and cell 2 is a B cell. If we use nonExpressedGeneList = list(HB = c("HBB", "HBA2"), IG = c("IGKC")), would SoupX use the IG genes to estimate contamination in cell 1 and the HB genes in cell 2? If so, could we go further and say that the estimate becomes more accurate as more (accurate) gene sets are given?

I am new to single-cell RNA-seq. Sorry if my question is too naive.

Many thanks! Yiwei Niu

constantAmateur commented 4 years ago

When multiple gene sets are provided, the estimation will be more accurate, provided those gene sets are appropriately selected. But unless you have a specific need (such as requiring precise cell-specific contamination estimates), one good gene set provides plenty of information in most cases.
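To make the pooling idea concrete, here is a minimal base-R sketch using the red blood cell / B cell example from the question. It is a simplified illustration, not SoupX's actual implementation: the counts, the soup profile `soupFrac`, the helper `estimate_rho`, and the hard-coded `useToEst` matrix are all hypothetical. In SoupX itself, deciding which cells can be used with which gene set is the job of `estimateNonExpressingCells`; here that decision is simply written out by hand.

```r
# Toy counts (hypothetical numbers): rows are genes, columns are cells.
counts <- matrix(
  c(900,   3,   # HBB
    400,   1,   # HBA2
      2, 700,   # IGKC
    698, 796),  # all other genes lumped together
  nrow = 4, byrow = TRUE,
  dimnames = list(c("HBB", "HBA2", "IGKC", "OTHER"), c("rbc", "bcell"))
)

# Assumed fraction of the soup (ambient RNA) contributed by each marker gene.
soupFrac <- c(HBB = 0.02, HBA2 = 0.01, IGKC = 0.01)

nonExpressedGeneList <- list(HB = c("HBB", "HBA2"), IG = c("IGKC"))

# Which cells are assumed NOT to express each gene set (hard-coded here):
# the red blood cell is usable with IG, the B cell is usable with HB.
useToEst <- matrix(
  c(FALSE, TRUE,    # rbc:   HB, IG
    TRUE,  FALSE),  # bcell: HB, IG
  nrow = 2, byrow = TRUE,
  dimnames = list(c("rbc", "bcell"), c("HB", "IG"))
)

# Pooled contamination fraction: observed marker counts in non-expressing
# cells divided by the counts expected if those cells were 100% soup.
estimate_rho <- function(counts, soupFrac, geneSets, useToEst) {
  nUMI <- colSums(counts)
  observed <- 0
  expected <- 0
  for (set in names(geneSets)) {
    genes <- geneSets[[set]]
    cells <- rownames(useToEst)[useToEst[, set]]
    observed <- observed + sum(counts[genes, cells, drop = FALSE])
    expected <- expected + sum(nUMI[cells]) * sum(soupFrac[genes])
  }
  observed / expected
}

rho_both <- estimate_rho(counts, soupFrac, nonExpressedGeneList, useToEst)
rho_ig   <- estimate_rho(counts, soupFrac, nonExpressedGeneList["IG"], useToEst)
print(c(both = rho_both, IG_only = rho_ig))
```

With both gene sets, each cell contributes through the set it does not express, so the pooled estimate rests on more observations than the IG-only estimate (6 observed counts against 65 expected, versus 2 against 20). With real data that difference is often small, which is consistent with the remark that one good gene set is usually enough.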