hanbyulcho closed this issue 4 years ago
Based on the plots you've posted, it looks like none of the genes would be suitable for estimating the contamination rate. I would not suggest proceeding with estimation using any of those genes, as you will likely get an inflated estimate of the contamination rate and do more harm than good to your data.
Are there any other genes that you think might work for your particular case? The genes selected automatically by plotMarkerDistribution are just a heuristic and can miss useful genes. Have you tried HB (haemoglobin) genes?
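For reference, a minimal sketch of what estimating with a hand-picked gene set could look like. It assumes `sc` is a SoupChannel with clustering already attached (for example one created by `load10X`), and the human haemoglobin symbols below are only an illustrative guess that should be checked against the annotation. `estimate_with_genes` is a hypothetical wrapper, not a SoupX function.

```r
# Illustrative gene set: human haemoglobin genes (an assumption, adjust
# to the species and annotation of your own data).
hb_genes <- list(HB = c("HBB", "HBA1", "HBA2"))

# Hypothetical wrapper around two SoupX calls: first decide which cells
# can be treated as not expressing the gene set, then estimate the
# contamination fraction from those cells only.
estimate_with_genes <- function(sc, gene_list) {
  use_to_est <- SoupX::estimateNonExpressingCells(
    sc,
    nonExpressedGeneList = gene_list
  )
  SoupX::calculateContaminationFraction(
    sc,
    gene_list,
    useToEst = use_to_est
  )
}

# Usage, assuming `sc` already exists:
# sc <- estimate_with_genes(sc, hb_genes)
```

If the estimate looks implausibly high, that is a sign the gene set is genuinely expressed in some of the cells used for estimation.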
If none of these are suitable, I would recommend either not doing any contamination correction, or trying a range of sensible values (2%-10%) and seeing what effect it has on your downstream analysis.
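Trying a range of values could be sketched roughly as follows. This assumes `sc` is an existing SoupChannel with clustering attached; `rho_grid` and `adjust_over_grid` are hypothetical names used only for illustration, and the three fractions are arbitrary picks from the 2%-10% range.

```r
# Manually chosen contamination fractions to compare (illustrative values).
rho_grid <- c(0.02, 0.05, 0.10)

# Hypothetical helper (not part of SoupX): set each fraction by hand and
# produce one corrected count matrix per value.
adjust_over_grid <- function(sc, rho_grid) {
  out <- lapply(rho_grid, function(rho) {
    sc_rho <- SoupX::setContaminationFraction(sc, rho)
    SoupX::adjustCounts(sc_rho)
  })
  names(out) <- paste0("rho_", rho_grid)
  out
}

# Usage, assuming `sc` already exists:
# adjusted <- adjust_over_grid(sc, rho_grid)
# Then rerun the downstream analysis on each element of `adjusted`.
```

Each corrected matrix can then go through the same downstream steps (clustering, marker detection) to see whether the conclusions depend on the chosen fraction.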
In my project, no genes are known to be highly specific to just one population of cells (immune, luminal, fibroblast, endothelial).
I therefore plotted the top 60 candidate soup-specific genes using "plotMarkerDistribution", as described in the detailed vignette. However, it is hard to see a bimodal distribution for any gene except TGM4 (located in the middle of the top 20 candidates).
Which genes could serve as soup-specific genes in this case, given that there are no bimodal distributions?
In the detailed vignette, all immunoglobulin genes are used because IGKC and IGLC2 were chosen from the plot. However, that is not the case in my data. How can I build the gene list around TGM4 without any biological information? Should I use TGM4 alone as the soup-specific gene?
I have attached my plots for reference.
[Attached plots: top 1-20, top 21-40, top 41-60]
Thanks,