zqfang / GSEApy

Gene Set Enrichment Analysis in Python
http://gseapy.rtfd.io/
BSD 3-Clause "New" or "Revised" License

what is the difference between prerank and single sample #103

Closed AEDWIP closed 2 years ago

AEDWIP commented 4 years ago

I am a newbie.

I read https://www.gsea-msigdb.org/gsea/doc/GSEAUserGuideFrame.html. There is a discussion about prerank, but I did not see anything about single sample.

I am working with a single bulk gene expression data set (mRNA). I plan to use prerank() and TPM values.

thanks

Andy

P.S. Is there a better place to ask questions like this? Also, you might consider adding a link to the FAQ in the documentation.

zqfang commented 4 years ago

I don't know where a better place would be. Maybe Gitter?

In short,

The statistics used by prerank (GSEA) and ssGSEA are different. Assume that we have calculated the running enrichment scores for your ranked input genes, then

Yep, I'll improve that
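To make the difference between the two statistics concrete, here is a minimal sketch based on the commonly published definitions, not on GSEApy's own code. It assumes the GSEA-style enrichment score is the running score with the largest deviation from zero, while the ssGSEA-style score is the sum of all running scores. The function `running_es` and the toy data are hypothetical, and real implementations differ in weighting, rank transformation, and normalization.

```python
from itertools import accumulate


def running_es(ranked_scores, in_set, weight=1.0):
    """Toy running enrichment score along a ranked gene list (hypothetical helper)."""
    # Genes in the set step the score up, proportionally to |score| ** weight.
    hit_weights = [abs(s) ** weight if hit else 0.0
                   for s, hit in zip(ranked_scores, in_set)]
    total_hit = sum(hit_weights)
    # Genes outside the set step the score down by a constant amount.
    n_miss = sum(1 for hit in in_set if not hit)
    p_hit = accumulate(w / total_hit for w in hit_weights)
    p_miss = accumulate(0.0 if hit else 1.0 / n_miss for hit in in_set)
    return [h - m for h, m in zip(p_hit, p_miss)]


# Made-up ranking metric (sorted high to low) and gene set membership.
ranked_scores = [2.5, 2.0, 1.5, 1.0, 0.5, -0.5, -1.0, -2.0]
in_set = [True, True, False, True, False, False, False, False]

res = running_es(ranked_scores, in_set)

# GSEA / prerank style: the running score furthest from zero.
es_gsea = max(res, key=abs)

# ssGSEA style: the sum of all running scores along the list.
es_ssgsea = sum(res)

print(es_gsea, es_ssgsea)
```

With the same running scores, the two summaries can rank gene sets differently: the maximum reflects one peak, while the sum reflects the whole curve.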

AEDWIP commented 4 years ago

Thanks, zqfang. I really appreciate the work you have done on GSEApy.

A few more minor questions.

I am using prerank() on gene expression data with the GO biological process gene sets. prerank() generates about 39 png files. I looked through the code but could not figure out why, out of all the possible pathway plots, these 29 were generated. What makes them special?

I have been filtering the results based on 'fdr' and 'pvalue', then sorting by 'nes'. None of my top pathways get plotted automatically. It was not obvious how I could generate the enrichment plots myself.
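The filtering and sorting described above can be sketched with the standard library alone. The column names ('Term', 'nes', 'pvalue', 'fdr'), the 0.05 cutoffs, and the inline CSV are assumptions for illustration; the actual column names in a GSEApy results file may differ between versions.

```python
import csv
import io

# Stand-in for a results CSV; in practice this would be open("<results file>").
results_csv = io.StringIO(
    "Term,nes,pvalue,fdr\n"
    "PATHWAY_A,2.1,0.001,0.01\n"
    "PATHWAY_B,-1.8,0.004,0.04\n"
    "PATHWAY_C,1.2,0.2,0.5\n"
    "PATHWAY_D,2.4,0.03,0.3\n"
)

rows = list(csv.DictReader(results_csv))

# Keep only pathways that pass both (assumed) significance cutoffs.
significant = [
    r for r in rows
    if float(r["fdr"]) < 0.05 and float(r["pvalue"]) < 0.05
]

# Sort by normalized enrichment score, highest first.
top = sorted(significant, key=lambda r: float(r["nes"]), reverse=True)
top_terms = [r["Term"] for r in top]

print(top_terms)
```

The resulting list of terms is what would then need to be plotted by hand when those pathways are not among the automatically generated figures.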

A minor enhancement request: the pathway names in the results csv file do not match the names of the plots that are generated automatically. This makes it hard to display them programmatically in my Jupyter notebook.

Kind regards

Andy

zqfang commented 4 years ago

prerank() only generates the top n figures for you (the same behavior as GSEA), because the program is slow when it has to generate a lot of plots. If you have 40 gene_sets, you could set graph_num=40.

You could also make your own plots in a Jupyter notebook if your pathway is not plotted. See here: https://gseapy.readthedocs.io/en/latest/gseapy_example.html#3.2-How-to-generate-your-GSEA-plot-inside-python-console

As for the names of the plots, I have to strip some characters that cause problems when saving files. Let me see what I can do.
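Until the names line up, one possible workaround is to compare pathway names and plot file names after reducing both to lowercase letters and digits. This is only a sketch: the helper names, the example terms, and the example file names are hypothetical. It assumes each plot file name is the pathway name with some characters removed or replaced, plus a single extension, which may not be what GSEApy actually does.

```python
import os
import re


def normalize(name):
    """Reduce a name to lowercase letters and digits only (hypothetical helper)."""
    return re.sub(r"[^0-9a-z]+", "", name.lower())


def match_plots(terms, filenames):
    """Map each pathway name to the plot file whose normalized stem matches it."""
    by_key = {}
    for filename in filenames:
        stem, _ext = os.path.splitext(os.path.basename(filename))
        by_key[normalize(stem)] = filename
    # Terms without a matching file map to None.
    return {term: by_key.get(normalize(term)) for term in terms}


# Made-up names, only to show the idea.
terms = ["regulation of cell cycle (GO:0051726)", "DNA repair (GO:0006281)"]
filenames = ["DNA repair GO_0006281.png", "regulation of cell cycle GO_0051726.png"]

plots = match_plots(terms, filenames)
print(plots)
```

In a notebook, the matched paths could then be passed to `IPython.display.Image` to show the figures for the selected pathways.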