The differentially expressed genes for each deconvolved cell-type

ShawnXzz commented 2 years ago

Hi all, I don't have single-cell data. I want to use this tool to determine the cell types in my 10x Visium data. I would like to get the top 5 marker genes for each cluster instead of only the top one. What should I do? Thanks!

bmill3r commented 2 years ago

Hi @ShawnXzz,

Thanks for using STdeconvolve and reaching out. STdeconvolve has a function called topGenes() to extract the top n genes for each deconvolved cell type based on their predicted gene expression distributions (i.e. the beta matrix output).

Following the mOB dataset as an example:

library(STdeconvolve)
## load built in data
data(mOB)
pos <- mOB$pos
cd <- mOB$counts
annot <- mOB$annot
## remove pixels with too few genes
counts <- cleanCounts(cd, min.lib.size = 100)
## feature select for genes
corpus <- restrictCorpus(counts, removeAbove=1.0, removeBelow = 0.05)
## choose optimal number of cell-types
ldas <- fitLDA(t(as.matrix(corpus)), Ks = seq(2, 9, by = 1))
## get best model results
optLDA <- optimalModel(models = ldas, opt = "min")
## extract deconvolved cell-type proportions (theta) and transcriptional profiles (beta)
results <- getBetaTheta(optLDA, perc.filt = 0.05, betaScale = 1000)
deconProp <- results$theta
deconGexp <- results$beta

## now get the top 5 genes for each of the deconvolved cell types:
topGenes(deconGexp, n = 5)
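
For intuition, the same kind of ranking can be sketched in base R. This is only an illustration and may differ in detail from what topGenes() does internally. It assumes the beta matrix has deconvolved cell types as rows and genes as columns. The toy matrix, its values, and the helper name top_n_per_celltype are made up for this example and are not part of STdeconvolve.

```r
## toy beta matrix: rows are deconvolved cell types, columns are genes
## (values are invented purely for illustration)
beta <- matrix(
  c(5, 1, 3, 0, 2,
    0, 4, 1, 6, 2),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("1", "2"),
                  c("geneA", "geneB", "geneC", "geneD", "geneE"))
)

## hypothetical helper (not an STdeconvolve function):
## for each cell type, sort genes by decreasing beta value and keep the first n
top_n_per_celltype <- function(beta, n = 5) {
  res <- lapply(seq_len(nrow(beta)), function(i) {
    head(sort(beta[i, ], decreasing = TRUE), n)
  })
  setNames(res, rownames(beta))
}

## top 3 genes per cell type in the toy matrix
top3 <- top_n_per_celltype(beta, n = 3)
print(top3)
```

Applied to real output, the same helper could be called as top_n_per_celltype(deconGexp, n = 5), again assuming deconGexp is a cell-type by gene matrix.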

To annotate the deconvolved cell types, I would also suggest the function annotateCellTypesGSEA(). For 10X data, a good place to start is the vignette, which shows two different ways of accessing 10X datasets. Check out the sections "SpatialExperiment inputs", "Annotation Strategy 1", and "Annotation Strategy 2".

Hope this helps, Brendan

ShawnXzz commented 2 years ago

Hi! Both strategies you suggested are great, thanks for your help!

JEFworks-Lab / STdeconvolve

The differentially expressed genes for each deconvolved cell-type #17