ctlab / fgsea

Fast Gene Set Enrichment Analysis

Obtain per-sample expression value for a gene set #137

Open PatrickMaclean opened 1 year ago

PatrickMaclean commented 1 year ago

Hi - thank you for your great work on the package.

I'm sorry if I've overlooked something obvious, but how do I extract the per-sample scores for each geneset obtained through geseca?

I can run geseca with: gesecaRes <- geseca(pathways, E, minSize = 15, maxSize = 500)

and then plot the per-sample expression value for a specified geneset with: plotCoregulationProfile(pathway=pathways[["SANA_TNF_SIGNALING_UP"]], E=E, titles = metadata$sample, conditions = metadata$cohort)

How do I just obtain the data used by plotCoregulationProfile (sample x expression value for a specified gene set) so I can plot it more flexibly?

Thanks!

vdsukhov commented 1 year ago

@PatrickMaclean Hi

You could try to adapt the code that we are using inside of plotCoregulationProfile:

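library(data.table)

pathway <- pathways[["SANA_TNF_SIGNALING_UP"]]  # gene set of interest
conditions <- NULL  # your conditions here, or leave as NULL
center <- TRUE      # use the same values you pass to plotCoregulationProfile
scale <- TRUE
# standardise each gene (row) across samples
E <- t(base::scale(t(E), center=center, scale = scale))

genes <- pathway

# keep the conditions in their order of first appearance
if (!is.null(conditions) && is.character(conditions)) {
    conditions <- factor(conditions, levels=unique(conditions))
}

# one row per sample: y is the mean scaled expression over the genes in the set
pointDt <- data.table(x = seq_len(ncol(E)),
                        y = colSums(E[rownames(E) %in% genes, , drop=FALSE]) / sum(rownames(E) %in% genes),
                        condition=if (!is.null(conditions)) { conditions  } else "x")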
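
The snippet above can also be wrapped into a small helper that returns the per-sample scores as a table. This is only a minimal sketch in base R, not part of fgsea: the function name pathwayScores and the toy matrix are made up for illustration, and it assumes E is a genes x samples matrix with gene identifiers as row names.

```r
# Hypothetical helper (not part of fgsea): per-sample mean of the scaled
# expression over the genes of one gene set.
pathwayScores <- function(pathway, E, center = TRUE, scale = TRUE,
                          conditions = NULL) {
    # Standardise each gene (row) across samples
    E <- t(base::scale(t(E), center = center, scale = scale))
    inSet <- rownames(E) %in% pathway
    stopifnot(any(inSet))  # at least one gene of the set must be in E
    data.frame(sample = colnames(E),
               score = colMeans(E[inSet, , drop = FALSE]),
               condition = if (is.null(conditions)) NA else conditions,
               row.names = NULL)
}

# Toy data (made up): 5 genes x 4 samples
set.seed(1)
E <- matrix(rnorm(20), nrow = 5,
            dimnames = list(paste0("g", 1:5), paste0("s", 1:4)))
# "g9" is not in E and is silently ignored, as in the snippet above
scores <- pathwayScores(c("g1", "g3", "g9"), E,
                        conditions = c("a", "a", "b", "b"))
print(scores)
```

The resulting data frame has one row per sample, so it can be passed to any plotting function in place of the plot drawn by plotCoregulationProfile.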

If you want to obtain information about the behavior of each gene in a gene set, take a look at how we handle it here
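
For per-gene values, one possible sketch is to reshape the scaled matrix into long format. This is not the package's own code: the helper name geneProfiles and the toy matrix are made up for illustration, and E is again assumed to be a genes x samples matrix with gene identifiers as row names.

```r
# Hypothetical helper (not part of fgsea): scaled expression of every gene
# in a set, one row per gene/sample pair.
geneProfiles <- function(pathway, E, center = TRUE, scale = TRUE) {
    # Standardise each gene (row) across samples
    E <- t(base::scale(t(E), center = center, scale = scale))
    sub <- E[rownames(E) %in% pathway, , drop = FALSE]
    # as.vector() reads the matrix column by column, so genes cycle fastest
    data.frame(gene = rep(rownames(sub), times = ncol(sub)),
               sample = rep(colnames(sub), each = nrow(sub)),
               value = as.vector(sub))
}

# Toy data (made up): 5 genes x 4 samples
set.seed(2)
toyE <- matrix(rnorm(20), nrow = 5,
               dimnames = list(paste0("g", 1:5), paste0("s", 1:4)))
profiles <- geneProfiles(c("g2", "g4"), toyE)
print(head(profiles))
```

Averaging value within each sample gives back the per-sample score of the gene set.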

Let me know if something is unclear