Closed Osynchronika closed 4 years ago
SoupX does not require you to specify which genes are contaminants. All genes will exhibit some degree of contamination, although for many genes the amount will be vanishingly small.
If you have particular genes you suspect are contaminating different populations, I suggest you run plotMarkerMap to see their distribution.
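To make the point about every gene carrying some contamination concrete, here is a minimal Python sketch of the underlying idea. It is not SoupX code and does not use its API: the function names `soup_profile` and `expected_soup_counts`, the toy gene names, the counts, and the contamination fraction `rho = 0.1` are all made up for illustration. It assumes the background profile is the pooled expression fraction of each gene in empty droplets, and that the expected background in a cell scales with that cell's total UMIs.

```python
import numpy as np

def soup_profile(empty_counts):
    """Per-gene fraction of background expression, pooled over empty droplets.

    empty_counts: array of shape (n_empty_droplets, n_genes).
    """
    totals = empty_counts.sum(axis=0).astype(float)
    return totals / totals.sum()

def expected_soup_counts(cell_umis, rho, profile):
    """Expected background counts per gene in each cell: rho * nUMIs * profile.

    cell_umis: total UMIs per cell, shape (n_cells,).
    rho: assumed contamination fraction (hypothetical value, not estimated here).
    """
    return rho * np.outer(cell_umis, profile)

# Toy data (hypothetical): 2 empty droplets x 4 genes.
gene_names = ["lncRNA_A", "splicing_B", "marker_C", "rare_D"]
empty = np.array([[40, 30, 29, 1],
                  [60, 20, 19, 1]])

profile = soup_profile(empty)
cell_umis = np.array([1000, 2000])
expected = expected_soup_counts(cell_umis, rho=0.1, profile=profile)

# Every gene gets a non-zero share, but genes that are rare in the soup
# contribute very little to any cell.
for name, frac, counts in zip(gene_names, profile, expected.T):
    print(f"{name}: soup fraction {frac:.3f}, expected counts per cell {counts}")
```

In this toy setup, a gene that dominates the empty droplets accounts for most of the expected background in every cell regardless of cell type, while a gene that is rare in the soup contributes almost nothing.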
Hello,
Thanks for creating SoupX; it seems like a nice package for accounting for the background noise in 10X data. I have a small question, though. I have two single-cell samples of a tissue, one from a healthy patient and one from a diseased patient, and I know a bunch of cell-type-specific genes that contaminate the "soup". I could successfully remove them with your algorithm. However, I also see in each sample a number of general contaminants (like lncRNAs and splicing regulators, most of which are also highly expressed) that are sample-specific. I assume that they are contaminants because I see them in the "empty droplets". When I look for DE genes between the samples, I of course find those contaminants at the top of the lists for all the clusters, but as I said, I assume they are just artifacts. I haven't fully understood how to deal in SoupX with those general contaminant genes that are present in all cell types. Do I just run calculateContaminationFraction() with the list of these genes on all the cells I have? I tried that, but it didn't seem to do much. Or is there a different way to handle this?
Thanks in advance!