Closed LNGDingj closed 2 years ago
Hi @LNGDingj ,
May I check a few things:
- Do you have the fgsea package installed in your R environment?
- Does key deseq2 exist in pseudo.varm?
- If so, does key log2FoldChange exist in pseudo.varm['deseq2']? You may construct a pandas data frame by
df = pd.DataFrame(pseudo.varm['deseq2'], index=pseudo.var_names)
and then check whether log2FoldChange exists in df.columns.
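The checks above can also be scripted. This is a minimal sketch, assuming pseudo.varm['deseq2'] behaves like a NumPy structured array; the helper name check_de_results is hypothetical (not part of Pegasus), and the small stand-in data below replace a real pseudo object:

```python
import numpy as np
import pandas as pd


def check_de_results(varm, var_names, de_key="deseq2", col="log2FoldChange"):
    """Return (ok, message) describing whether varm[de_key][col] is usable."""
    if de_key not in varm:
        return False, f"key '{de_key}' not found in varm"
    df = pd.DataFrame(varm[de_key], index=var_names)
    if col not in df.columns:
        return False, f"column '{col}' not found in varm['{de_key}']"
    n_missing = int(df[col].isna().sum())
    return True, f"'{col}' found: {len(df)} genes, {n_missing} NaN values"


# Stand-in for pseudo.varm and pseudo.var_names (made-up values).
varm = {
    "deseq2": np.array(
        [(1.2, 0.01), (-0.7, 0.20)],
        dtype=[("log2FoldChange", "f8"), ("padj", "f8")],
    )
}
var_names = ["Actb", "Gapdh"]

ok, msg = check_de_results(varm, var_names)
print(ok, msg)
```

With a real object, the call would presumably be check_de_results(pseudo.varm, pseudo.var_names).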
If everything looks good above, could you share the output of the following code:
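# `pseudo` is the pseudo-bulk data object returned by your DESeq2 step.
from pegasus.tools import predefined_pathways, load_signatures_from_file
import rpy2.robjects as ro
from rpy2.robjects.packages import importr
# Load the R fgsea package through rpy2.
fgsea = importr("fgsea")
# Load the preset gene sets and convert them to an R named list.
pwdict = load_signatures_from_file(predefined_pathways.get('canonical_pathways', 'canonical_pathways'))
pathways_r = ro.ListVector(pwdict)
# Build a named R vector of log2 fold changes, one entry per gene.
log2fc = ro.FloatVector(pseudo.varm['deseq2']['log2FoldChange'])
log2fc.names = ro.StrVector(pseudo.var_names)
# Run fgsea directly and print the raw result table.
res = fgsea.fgsea(pathways_r, log2fc, minSize=15, maxSize=500, nproc=0)
print(res)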
Sincerely, Yiming
Hi @yihming ,
Thanks a lot for your help.
The issue I reported is very likely caused by mouse genetic nomenclature (mouse gene symbols have the first letter capitalized, followed by lowercase letters), which differs from human genetic nomenclature. I have attached the gene symbols for the mouse genome: mouse_gene_symbols.txt
Here is the information you asked for:
- Do you have the fgsea package installed in your R environment? --- Yes. The installation works without any problems, since all function calls in the "Pseudobulk Analysis Tutorial" can be reproduced with the human genome data 'MantonBM_nonmix_subset.zarr.zip'.
- Does key deseq2 exist in pseudo.varm? --- Yes, 'deseq2' is in pseudo.varm.
- If so, does key log2FoldChange exist in pseudo.varm['deseq2']? --- Yes, key log2FoldChange exists in pseudo.varm['deseq2'].
- print(res) --- Empty data.table (0 rows and 8 cols): pathway,pval,padj,log2err,ES,NES...
I see.
The preset gene sets that Pegasus provides are from MSigDB, and all MSigDB gene sets consist of human gene symbols.
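One way to see why this yields an empty table: fgsea only keeps a pathway if at least minSize of its genes appear among the names of the ranked statistics, and symbol matching is case-sensitive. Below is a minimal sketch with made-up gene lists; the helper name pathway_overlap is hypothetical and only mimics that size filter:

```python
def pathway_overlap(pathways, gene_names, min_size=15, max_size=500):
    """Count, per pathway, how many of its genes appear in gene_names.

    Returns {pathway: (n_overlap, kept)}, where `kept` mimics a
    minSize/maxSize filter applied to the overlapping genes.
    """
    universe = set(gene_names)
    result = {}
    for name, genes in pathways.items():
        n = len(universe.intersection(genes))
        result[name] = (n, min_size <= n <= max_size)
    return result


# Made-up human-style pathway and mouse-style symbols for illustration.
pathways = {"TOY_PATHWAY": ["ACTB", "GAPDH", "TP53"]}
mouse_names = ["Actb", "Gapdh", "Trp53"]

print(pathway_overlap(pathways, mouse_names, min_size=2))
# {'TOY_PATHWAY': (0, False)}
```

If every pathway reports an overlap of 0 against your gene names, a result with 0 rows is what you would expect.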
To make GSEA work for your mouse data, you may find this discussion helpful.
Besides, it seems that MSigDB provides some mouse gene sets here. However, it's still in DRAFT status, and we haven't tested them.
If you have a gene set gmt file yourself, you can use it directly by specifying its file path in the pathways parameter of the fgsea function.
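For reference, a GMT file is tab-separated text with one gene set per line: the set name, a description, then the gene symbols. Below is a minimal sketch for inspecting such a file before using it; read_gmt is a hypothetical helper (not a Pegasus function), and the file written here is a toy example:

```python
import os
import tempfile


def read_gmt(path):
    """Parse a GMT file into {set_name: [gene, ...]}."""
    gene_sets = {}
    with open(path) as fin:
        for line in fin:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 3:
                continue  # skip blank or malformed lines
            name, _description, *genes = fields
            gene_sets[name] = [g for g in genes if g]
    return gene_sets


# Write a toy two-set GMT file and read it back.
with tempfile.NamedTemporaryFile("w", suffix=".gmt", delete=False) as tmp:
    tmp.write("SET_A\tna\tActb\tGapdh\n")
    tmp.write("SET_B\tna\tTrp53\n")
    gmt_path = tmp.name

gene_sets = read_gmt(gmt_path)
os.remove(gmt_path)
print(gene_sets)
```

Checking that the parsed symbols use the same capitalization as your gene names is a quick sanity test before running the enrichment.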
Otherwise, if you really want to compare your data with human genes, you may capitalize all the gene names in your data (i.e. pseudo.var_names) to "cheat" the system.
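A minimal sketch of that workaround on a plain list of symbols; upper_case_symbols is a hypothetical helper. Keep in mind that upper-casing only imitates human symbols and is not an ortholog mapping: for example, mouse Trp53 becomes TRP53, not the human symbol TP53.

```python
from collections import Counter


def upper_case_symbols(names):
    """Upper-case gene symbols and report any names that collide afterwards."""
    upper = [n.upper() for n in names]
    duplicates = sorted(s for s, c in Counter(upper).items() if c > 1)
    return upper, duplicates


mouse_names = ["Actb", "Gapdh", "Trp53"]
upper, duplicates = upper_case_symbols(mouse_names)
print(upper)       # ['ACTB', 'GAPDH', 'TRP53']
print(duplicates)  # []
```

The duplicate check is there because two distinct symbols could, in principle, become identical once upper-cased; how to write the new names back into the data object is left out here.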
Sincerely, Yiming
Thanks a lot for all your help!
Hello, @bli25 @yihming
pegasus.fgsea does NOT work with the mouse genome. I got errors while running pegasus.fgsea on pseudo-bulk data from pegasus.deseq2. Could you help with this? Thanks so much!
Background information:
- Cloud environment: R/Bioconductor (Python 3.7.12, R 4.1.2, Bioconductor 3.1.4, tidyverse 1.3.1)
- Pegasus 1.5.1 installed from GitHub
- Mouse snRNA-seq data
- My pseudo-bulk results (pseudo) from pegasus.deseq2 and the plots from pegasus.pseudo.volcano look reasonable
Error messages from calling pg.fgsea(pseudo, 'log2FoldChange', 'canonical_pathways', 'deseq2', fgsea_key = 'fgsea_deseq2')