satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat
Other
2.29k stars 915 forks source link

Percentage of gene markers in sample #6818

Closed atlmalive closed 1 year ago

atlmalive commented 1 year ago

Hi all,

Please see #6765 and #6770 and #371

I am following the seurat vignette and was trying to figure out how to get the percentage of markers expressed in a named cluster with the prop.table command.

**#### count gene marker count #### DefaultAssay(obj) <- "RNA" table(Idents(obj))

selected markers

genes <- cd_genes

check if selected markers are present in the object

genes %<>% intersect(rownames(obj)) thres = 0 # threshold for expression level, eg > 3 counts mac = c(0,2) cells.out <- lapply(seq(genes), function (i) { which(FetchData(obj, slot = "data", cell = WhichCells(obj, idents = mac), vars = genes[i]) > thres, arr.ind = TRUE, useNames = TRUE) %>% rownames %>% length } ) names(cells.out) <- genes # add names to a list gmark_count <- cells.out %>% as.data.frame.list %>% t() %>% prop.table() write.csv(gmark_count, "results/genepercent.csv") ##saves data vertically**

The command generates percentages but the number generated is not correct. For instance, CD14 is 4 out of 218 (i.e.0.0183) but running this script results in 0.003 for CD14.

I looked at issue #371, I ran the function by @Ryan-Zhu PrctCellExpringGene(obj ,genes =c("CD14"), group.by = "seurat_clusters") which gave the correct percentages.

Markers Cell_proportion Feature 0 CD14 0.006849315 0 1 CD14 0.007142857 1 4 CD14 0.021978022 4 2 CD14 0.014705882 2 5 CD14 0.024390244 5 3 CD14 0.017543860 3

I am only interested in idents 0 and 2. Could this function be used to combine clusters and get the marker percentage for them?

alikhuseynov commented 1 year ago

gmark_count gives you total number of cells expressing a given gene in idents = c(0,1) gmark_count %>% prop.table() * 100 will convert them to percentages, then to check gmark_count %>% prop.table() %>% sum should always = 1

if #371 works, that's great. else a simplistic solution could be to make a DotPlot and extract what is needed.

pl1 <- DotPlot(obj, features = genes, idents = c(0,1))
pl1$data %>% str 
# pl1$data$pct.exp is what you need

another alternative is to check this function ?Seurat::FoldChange

best

saketkc commented 1 year ago

You can use the FoldChange command to get the pct.exp values.