ctlab / fgsea

Fast Gene Set Enrichment Analysis
Other
379 stars 68 forks source link

Question: Visium Data and pulling out cells with high GO enrichment for a given pathway #127

Open DirtyHarry80 opened 2 years ago

DirtyHarry80 commented 2 years ago

Hi, I am using your package on Visium data as described in your recent manual, it works very fine! Related to that, I am wondering if there is a way to pull out identities of spots that have a particularly high GO enrichment (for a particular pathway or process)? In other words, where are the results of "fgsea" stored in Seurat object?

vdsukhov commented 2 years ago

@DirtyHarry80 Hi

I will consider the code from our vignette for analysis of spatial datasets (https://bioconductor.org/packages/release/bioc/vignettes/fgsea/inst/doc/geseca-tutorial.html#analysis-of-spatial-rna-seq). In our vignette we first of all perform enrichment analysis and after that to plot the expressions per pathway we use the following code:

topPathways <- gesecaRes[, pathway] |> head(4)

for (ppn in topPathways) {
    pp <- pathways[[ppn]]
    pp <- intersect(pp, rownames(E))

    score <- colSums(brain@assays$SCT@scale.data[pp, ])/sqrt(length(pp))
    brain@meta.data[[ppn]] <- score
}

SpatialFeaturePlot(brain, features = topPathways, )

it measures the pathway expression per each spot. After that, the values can be found in the metadata of object (brain@meta.data). For example:

brain@meta.data[, "KEGG_RIBOSOME", drop = FALSE]

Coordinates for each spot could be found in the following attribute - brain@images$anterior1@coordinates.

Pedramto89 commented 1 year ago

Hi @vdsukhov I could generate some results for my data (Visium) but I do not know why it shows only two pathways and for each has two images? It's because of showing only top pathways? How can I visualize the pathways of interest?

assaron commented 1 year ago

@Pedramto89 I've pushed some updates to the plotting code and the vignette. Please, check it out (you need to install the package from github with devtools::install_github('ctlab/fgsea')), may be it will resolve your question. If not, please provide more details: what code you're running and what it produces.

Pedramto89 commented 1 year ago

Thank you @assaron This is the code I use: `devtools::install_github('ctlab/fgsea')

library(fgsea) library(Seurat) library(SeuratData) library(ggplot2) library(patchwork) getwd() data <- readRDS(file = "savedobject.rds")

data <- RunPCA(data, assay = "SCT", verbose = FALSE, rev.pca = TRUE, reduction.name = "pca.rev", reduction.key="PCR_") E <- data@reductions$pca.rev@feature.loadings library(msigdbr) pathwaysDF <- msigdbr(species = "human", category = "H") pathways <- split(pathwaysDF$gene_symbol, pathwaysDF$gs_name)

set.seed(1)

gesecaRes <- geseca(pathways, E, minSize = 15, maxSize = 500, center = FALSE) topPathways <- gesecaRes[, pathway] |> head(10)

for (ppn in topPathways) { pp <- pathways[[ppn]] pp <- intersect(pp, rownames(E))

score <- colSums(data@assays$SCT@scale.data[pp, ])/sqrt(length(pp)) data@meta.data[[ppn]] <- score } SpatialFeaturePlot(data, features = topPathways, )` and what I get is the image I uploaded. I changed the head from 4 to 10 and as you see, the legend is so overlapped and I am not sure about the legend scale. It shows what?Also, I still have this question that how can I see the enrichment of a pathway of interest? Thank you! 10pathways

assaron commented 1 year ago

@Pedramto89 try to use the new plotCoregulationProfileSpatial as in here: https://github.com/ctlab/fgsea/blob/master/vignettes/geseca-tutorial.Rmd The values there are z-scores of the average gene expression of the gene set.

The titles can be shortened for convenience, and you can use cowplot::plot_grid (or other function) to arrange the plots.

topPathways <- gesecaRes[, pathway] |> head(10)
titles <- sub("HALLMARK_", "", topPathways)
ps <- plotCoregulationProfileSpatial(pathways[topPathways], obj,
                                       title=titles)
cowplot::plot_grid(plotlist=ps[1:4], ncol=2)

An individual pathway can be plotted as this:

plotCoregulationProfileSpatial(pathways$HALLMARK_OXIDATIVE_PHOSPHORYLATION,
                               obj,
                               title=sprintf("HALLMARK_OXIDATIVE_PHOSPHORYLATION (pval=%.2g)",
                                             gesecaRes[
                                                 match("HALLMARK_OXIDATIVE_PHOSPHORYLATION", pathway),
                                                 pval]))
Pedramto89 commented 1 year ago

Thanks so much for the prompt response @assaron It worked! Just wanted to let you know that the legend has been removed. Still scaling and interpreting is the question. 4new

Pedramto89 commented 1 year ago

Hi I tried it after a few months and get this weird image: I do not know why some parts of the image is not showing. Do you think is there any problem with my code? `library(msigdbr) library(fgsea) library(Seurat) library(SeuratData) library(ggplot2) library(patchwork) set.seed(1) E <- data@reductions$pca.rev@feature.loadings E <- obj@reductions$pca@feature.loadings pathwaysDF <- msigdbr(species = "human", category = "H") pathways <- split(pathwaysDF$gene_symbol, pathwaysDF$gs_name)

gesecaRes <- geseca(pathways, E, minSize = 15, maxSize = 500, center = FALSE) topPathways <- gesecaRes[, pathway] |> head(10) titles <- sub("HALLMARK_", "", topPathways) ps <- plotCoregulationProfileSpatial(pathways[topPathways], obj, title=titles) cowplot::plot_grid(plotlist=ps[1:4], ncol=2)

topPathways <- gesecaRes[, pathway] |> head(10) titles <- sub("HALLMARK_", "", topPathways) ps <- plotCoregulationProfileSpatial(pathways[topPathways], obj, title=titles) cowplot::plot_grid(plotlist=ps[1:4], ncol=2)

plotCoregulationProfileSpatial(pathways$HALLMARK_OXIDATIVE_PHOSPHORYLATION, obj, title=sprintf("HALLMARK_OXIDATIVE_PHOSPHORYLATION (pval=%.2g)", gesecaRes[ match("HALLMARK_OXIDATIVE_PHOSPHORYLATION", pathway), pval])) ` image

assaron commented 1 year ago

@Pedramto89 the images look OK on the first glance. What exactly are you missing?