YuLab-SMU / clusterProfiler

A universal enrichment tool for interpreting omics data
https://yulab-smu.top/biomedical-knowledge-mining-book/
Can I do KEGG analysis for organisms that are not among the KEGG-supported organisms? #127

Open liuxianghui opened 6 years ago

liuxianghui commented 6 years ago

Dear GuangChuang: I have two bacterial organisms, Klebsiella pneumoniae Kp-1 (https://www.ncbi.nlm.nih.gov/nuccore/CP012883.1) and Pseudomonas protegens Pf-5 (https://www.ncbi.nlm.nih.gov/nuccore/CP000076.1). Neither is supported in KEGG. I want to do some KEGG analysis of RNA-seq data for them. I tried to find annotations on the EzTaxon website, which provides information such as KEGG IDs. (Could you kindly suggest other databases where I can download annotations, ideally with GO and KEGG pathway information?) The file is tab-separated and looks like this:

CDS name Other name(s) EggNog ID Kegg ID Product Function Note Length Location
GCA_000465975.2_00001 ncd2|npd S:COG2070 K00459 Nitronate monooxygenase Flavoprotein; FMN; Monooxygenase; Oxidoreductase. Catalyzes the oxidation of alkyl nitronates to produce the corresponding carbonyl compounds and nitrites; Belongs to the nitronate monooxygenase family.; KEGG: kpu:KP1_2696 nitronate monooxygenase 1059 497..1555

I am thinking of running a pathway enrichment analysis on this. Since the organisms are not supported, I cannot use enrichKEGG. Can I still use the universal enrichment function enricher by providing a mapping of KEGG pathways to locus IDs? I would presumably need to convert the KEGG IDs (K numbers) to KEGG pathway IDs first, right? Please kindly advise,
Xianghui
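
A minimal sketch of the mapping step asked about above, assuming the annotation table has already been parsed into a gene-to-K-number table and that a K number to pathway table is available separately (for example, downloaded from KEGG). All data values, the placeholder pathway IDs, and the names gene2ko, ko2path, term2gene and run_enrichment are hypothetical. The only clusterProfiler call is enricher(), with its TERM2GENE argument taking the term in the first column and the gene in the second.

```r
# Toy inputs (hypothetical values, for illustration only).
# gene2ko: one row per locus, as parsed from an annotation table.
gene2ko <- data.frame(
  gene = c("locus_0001", "locus_0002", "locus_0003", "locus_0004"),
  ko   = c("K00459", "K01810", "K01810", NA),
  stringsAsFactors = FALSE
)
# ko2path: K number to pathway mapping (placeholder pathway IDs).
ko2path <- data.frame(
  ko      = c("K00459", "K01810", "K01810"),
  pathway = c("pathA", "pathB", "pathC"),
  stringsAsFactors = FALSE
)

# Drop loci without a K number, then join the two tables on the K number.
annotated <- gene2ko[!is.na(gene2ko$ko), ]
joined <- merge(annotated, ko2path, by = "ko")

# enricher() expects TERM2GENE with the term in column 1 and the gene in column 2.
term2gene <- unique(joined[, c("pathway", "gene")])

# Wrapper around the enrichment call; defined here but not executed,
# so the sketch runs even when clusterProfiler is not installed.
run_enrichment <- function(de_genes, term2gene) {
  clusterProfiler::enricher(
    gene      = de_genes,
    universe  = unique(term2gene$gene),
    TERM2GENE = term2gene
  )
}
# Usage (hypothetical DE list): run_enrichment(c("locus_0002", "locus_0003"), term2gene)
```

Because one K number can belong to several pathways, a locus may appear in more than one row of term2gene; that is expected for this kind of mapping.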

liuxianghui commented 6 years ago

Dear GuangChuang: Thank you for the ko option for analyzing organisms that are not among the KEGG organisms. I work on bacteria, and some of them are not KEGG organisms and have barely any annotation (no GO and no KEGG). With ko I can now run KEGG pathway enrichment analysis. The only limitation is when I try to plot a KEGG pathway with pathview: I am unable to put the correct fold change data on the map. I guess this is because we use K numbers, and multiple genes can have the same K number. Do you have a solution for that?
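
One possible workaround for the many-genes-per-K-number problem described above is to collapse the gene-level fold changes to a single value per K number before plotting. The sketch below keeps the fold change with the largest magnitude; that choice is an assumption, not a recommendation from the package authors, and a mean or median may suit other data better. The data values and the names lfc, gene2ko and ko_lfc are hypothetical.

```r
# Hypothetical gene-level log2 fold changes and gene-to-K-number mapping.
lfc <- c(locus_0001 = 1.5, locus_0002 = -2.0, locus_0003 = 0.5, locus_0004 = 3.0)
gene2ko <- c(locus_0001 = "K00459", locus_0002 = "K01810",
             locus_0003 = "K01810", locus_0004 = NA)

# Keep only genes that have a K number.
has_ko <- !is.na(gene2ko[names(lfc)])
lfc_ko <- lfc[has_ko]
ko_ids <- gene2ko[names(lfc_ko)]

# One value per K number: keep the fold change with the largest magnitude.
max_abs <- function(x) x[which.max(abs(x))]
ko_lfc <- tapply(lfc_ko, ko_ids, max_abs)

# Convert the result to a plain numeric vector named by K number.
ko_lfc <- setNames(as.numeric(ko_lfc), names(ko_lfc))
```

A vector like ko_lfc, named by K number, could then be passed as gene.data to pathview together with species = "ko" and a reference pathway ID. Whether that gives the desired map depends on the pathview version and the pathway chosen.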

brightbio commented 6 years ago

Use enricher to perform the GO analysis; see https://guangchuangyu.github.io/2015/05/use-clusterprofiler-as-an-universal-enrichment-analysis-tool.

Stepmata commented 4 years ago

Hi! I'm trying to use clusterProfiler to run a KEGG enrichment analysis on a data set from a non-model organism. When I run the analysis with the enrichKEGG() function, I get this error message:

ca_kegg <- enrichKEGG(ca_list, organism = 'ko', keyType = 'kegg', universe = BBRB_KEGG, pAdjustMethod = "BH")
--> No gene can be mapped....
--> Expected input gene ID: K00895,K01810,K21622,K16370,K15779,K01218
--> return NULL...

In this case, ca_list is my list of DE gene IDs and BBRB_KEGG is a data frame with two columns, gene IDs and KEGG annotations, that I got from Trinotate.

How can I solve this problem, and what does "No gene can be mapped" mean? Thank you!
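
The "Expected input gene ID" line in the message lists K numbers, which suggests that with organism = 'ko' the input genes themselves need to be K numbers, not the original gene IDs. The universe is likewise expected to be a vector of IDs, not a data frame. Below is a minimal sketch of that translation step, assuming the annotation table can be reduced to one gene ID column and one K number column. The column names gene_id and ko, the helper to_ko, and all data values are hypothetical stand-ins, not the actual Trinotate output format.

```r
# Hypothetical stand-in for the two-column annotation table (gene ID, K number).
annot <- data.frame(
  gene_id = c("gene1", "gene2", "gene3", "gene4"),
  ko      = c("K00895", "K01810", NA, "K21622"),
  stringsAsFactors = FALSE
)
de_ids <- c("gene1", "gene3", "gene4")  # hypothetical DE gene IDs

# Translate gene IDs to K numbers, dropping genes without an annotation.
to_ko <- function(ids, annot) {
  ko <- annot$ko[match(ids, annot$gene_id)]
  unique(ko[!is.na(ko)])
}
de_ko <- to_ko(de_ids, annot)
universe_ko <- to_ko(annot$gene_id, annot)

# Sanity check: every input ID should look like a K number.
stopifnot(all(grepl("^K[0-9]{5}$", de_ko)))

# The enrichment call is shown as a comment because it needs clusterProfiler
# and network access to KEGG:
# ca_kegg <- clusterProfiler::enrichKEGG(de_ko, organism = "ko", keyType = "kegg",
#                                        universe = universe_ko, pAdjustMethod = "BH")
```

If the sanity check fails on real data, the K number column probably still carries extra text that has to be stripped first.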