satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Mitochondrial genes in Canonical Marker List #7913

Closed TeodoraTockovska closed 1 year ago

TeodoraTockovska commented 1 year ago

Hi Seurat dev team,

Thanks for creating such an amazing package! I have a quick question regarding my canonical marker list.

I have 10 spatial transcriptomics samples that were pre-processed to remove barcodes with high mitochondrial content (some samples had around 75%) and outlier barcodes with very high UMI counts (a couple of barcodes had 100,000 UMIs, clear outliers in the violin plots, when the average was around 20,000). After QC, the samples were normalized and integrated into a single object using Harmony, and I then performed clustering.

To annotate cell types for my clusters, I ran FindAllMarkers with the following parameters: min.pct=0.25 and only.pos=TRUE. I've noticed that significant mitochondrial genes show up in my canonical marker list for a few clusters. Is this normal? The samples don't have high mitochondrial content after filtering (between 2% and 10% at most, depending on the sample).

Why would some of my clusters have significant mitochondrial genes? Should I ignore them?

Thanks, Teodora

longmanz commented 1 year ago

Hi, one thing you can do before RunPCA/RunUMAP is to regress out the mitochondrial gene effect during your ScaleData() step: ScaleData(object, vars.to.regress = "mito.percent"), where object is your Seurat object and 'mito.percent' should be replaced by the name of the metadata column that holds your mitochondrial gene percentage. This should mitigate the impact of mitochondrial genes on your PCA/UMAP/clustering analyses.
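As a rough illustration of the suggestion above, here is a minimal sketch. It assumes a Seurat object that has already been normalized and has variable features selected, and it assumes human gene symbols (mitochondrial genes prefixed with "MT-"; mouse data would typically use "^mt-"). The function name regress_mito_effect and the default column name "percent.mt" are hypothetical and chosen only for this example.

```r
# Sketch only: `regress_mito_effect`, `mito_col`, and `mito_pattern` are
# illustrative names, not part of Seurat. Requires the Seurat package.
regress_mito_effect <- function(obj,
                                mito_col = "percent.mt",
                                mito_pattern = "^MT-") {
  # Compute the per-barcode mitochondrial percentage if it is not stored yet.
  if (!mito_col %in% colnames(obj[[]])) {
    obj[[mito_col]] <- Seurat::PercentageFeatureSet(obj, pattern = mito_pattern)
  }

  # Scale the data while regressing out the mitochondrial percentage.
  obj <- Seurat::ScaleData(obj, vars.to.regress = mito_col)

  # Re-run PCA on the regressed, scaled data; integration, UMAP, and
  # clustering would then be repeated on top of this reduction.
  obj <- Seurat::RunPCA(obj, verbose = FALSE)
  obj
}
```

Usage would look like obj <- regress_mito_effect(obj). If the mitochondrial percentage is already stored under another column name, pass that name as mito_col so it is reused rather than recomputed.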

To validate that you have successfully mitigated the effect, you can check the genes with the highest loadings on your top PCs: few or none of them should be mitochondrial genes.
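The check described above could be sketched in base R as follows. The helper mito_in_top_loadings is a hypothetical name, and the toy loadings matrix is made up purely for illustration; the "^MT-" pattern again assumes human gene symbols.

```r
# Sketch only: `mito_in_top_loadings` is an illustrative helper, not a Seurat
# function. `loadings` is a genes x PCs numeric matrix with gene names as
# row names.
mito_in_top_loadings <- function(loadings, n_top = 20, pattern = "^MT-") {
  stopifnot(is.matrix(loadings), !is.null(rownames(loadings)))
  n_top <- min(n_top, nrow(loadings))

  flagged <- lapply(seq_len(ncol(loadings)), function(j) {
    pc <- setNames(loadings[, j], rownames(loadings))
    # Rank genes by absolute loading and keep the n_top strongest.
    top_genes <- names(sort(abs(pc), decreasing = TRUE))[seq_len(n_top)]
    # Return only those top genes that match the mitochondrial pattern.
    top_genes[grepl(pattern, top_genes)]
  })
  names(flagged) <- colnames(loadings)
  flagged
}

# Toy example with made-up loadings (4 genes, 2 PCs).
toy <- matrix(
  c(0.9, 0.1,
    0.2, 0.8,
    0.7, 0.05,
    0.1, 0.6),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("MT-CO1", "CD3E", "MT-ND1", "MS4A1"), c("PC_1", "PC_2"))
)
flagged <- mito_in_top_loadings(toy, n_top = 2)
print(flagged)
```

In this toy matrix both top genes of PC_1 are mitochondrial, while PC_2 has none. On a real object, the loadings matrix can be obtained with Loadings(obj, reduction = "pca") and passed to the helper in the same way.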