yigbt / multiGSEA

The `multiGSEA` R package was designed to run a robust GSEA-based pathway enrichment for multiple omics layers.

Could I create my own omics data for fungi #2

Closed iaunicorn closed 3 years ago

iaunicorn commented 4 years ago

Hi, thanks for your package. I learned that multiGSEA supports 11 different organisms, and I have read the vignette: https://bioconductor.org/packages/devel/bioc/vignettes/multiGSEA/inst/doc/multiGSEA.html. I would now like to use the package to analyze my own fungal data, and I have two questions. First, if I create three datasets (transcriptome, proteome, and metabolome) in the required input format, can I run the analysis without loading one of the annotation libraries for the 11 supported organisms (such as org.Hs.eg.db)? I am not sure whether this step is necessary. Second, if I have human omics data, do I have to load org.Hs.eg.db? I look forward to your reply.

boll3 commented 3 years ago

Hi!

The mapping of protein and transcript identifiers is done by the AnnotationDbi package. Depending on the species you want to analyze, you need a different 'database' package that provides the actual mapping information. So, for human omics data, you need the org.Hs.eg.db package to map the pathway-derived proteome and transcriptome features to a specific format such as Entrez Gene IDs or gene symbols (the target ID format depends on the IDs in your omics data). The mapping is also needed because different pathway databases provide different ID formats: KEGG pathways contain Entrez IDs, while Reactome pathways contain UniProt IDs.
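To illustrate the kind of ID conversion described above, here is a minimal sketch using `AnnotationDbi::mapIds` with org.Hs.eg.db. The three Entrez Gene IDs are hypothetical stand-ins for features taken from a KEGG pathway, and the exact calls multiGSEA makes internally may differ.

```r
library(AnnotationDbi)
library(org.Hs.eg.db)

# Hypothetical Entrez Gene IDs, standing in for KEGG-derived pathway features
entrez_ids <- c("7157", "672", "1956")  # TP53, BRCA1, EGFR

# Convert Entrez Gene IDs to gene symbols.
# multiVals = "first" keeps a single symbol per input ID.
symbols <- mapIds(org.Hs.eg.db,
                  keys = entrez_ids,
                  keytype = "ENTREZID",
                  column = "SYMBOL",
                  multiVals = "first")

print(symbols)
```

For another species, the same call would use the corresponding org.*.eg.db package, provided one exists for that organism.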

When you run the command

    pathways <- getMultiOmicsFeatures(dbs = databases,
                                      layer = layers,
                                      returnTranscriptome = "SYMBOL",
                                      returnProteome = "SYMBOL",
                                      returnMetabolome = "HMDB")

please make sure that the specified return ID formats are the ones actually used in your (multi-)omics data. The only purpose of the mapping step is to convert the IDs used in the pathway databases to the ID format used in your omics data.
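A quick way to spot an ID format mismatch is to check how many pathway features actually occur in your data. The sketch below uses base R only. The names `omics_ids` and `pathway_features` and their contents are hypothetical, and the named list of ID vectors is an assumed shape, not necessarily the exact structure that `getMultiOmicsFeatures` returns.

```r
# Hypothetical feature IDs from a transcriptome table (gene symbols here)
omics_ids <- c("TP53", "BRCA1", "EGFR", "MYC")

# Hypothetical pathway definitions as a named list of ID vectors
pathway_features <- list(
  pathway_A = c("TP53", "EGFR", "KRAS"),  # gene symbols: matches the data
  pathway_B = c("7157", "672")            # Entrez IDs: wrong format for this data
)

# Fraction of each pathway's features that are present in the omics data
overlap <- vapply(pathway_features,
                  function(ids) mean(ids %in% omics_ids),
                  numeric(1))

print(overlap)
```

An overlap at or near zero across most pathways usually means the return ID format does not match the IDs in the data.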

Unfortunately, fungi are not supported, for the following reason: the graphite R package, which is used to download the pathway definitions, doesn't support fungi at all, so you cannot retrieve any fungi-specific pathways.
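The set of species graphite offers can change between releases, so it may be worth checking against your installed version. A minimal sketch using `graphite::pathwayDatabases()`, which lists the available species and database combinations:

```r
library(graphite)

# Data frame with one row per available (species, database) combination
available <- pathwayDatabases()

# Species codes for which graphite provides pathway definitions
print(sort(unique(as.character(available$species))))
```

Even if a species appears here, multiGSEA itself would still need to support it and a matching annotation package would be required for the ID mapping.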

Hope that helps a bit.

Feel free to ask again, if something is still unclear!