YuLab-SMU / createKEGGdb

Create KEGG.db Package

Compatibility of createKEGGdb with keyType option of clusterProfiler::enrichKEGG function #12

Open SciLiciumTheo opened 1 year ago

SciLiciumTheo commented 1 year ago

Hello,

Thanks for this useful package!

I have some questions about what exactly is stored in the resulting KEGG.db and how that relates to the options of clusterProfiler::enrichKEGG. enrichKEGG has a keyType option, which accepts kegg, ncbi-geneid, ncbi-proteinid or uniprot.


Background/context

I would like to have a solution for doing KEGG enrichment analysis, starting from gene SYMBOL. I want to be able to use the same solution for any arbitrary species.

From this reply https://github.com/YuLab-SMU/clusterProfiler/issues/108#issuecomment-336784558

KEGG id and ENTREZID are the same for only some of the species, but not always the same.

and this blog post https://guangchuangyu.github.io/2016/05/convert-biological-id-with-kegg-api-using-clusterprofiler/

A rule of thumb for the ‘kegg’ ID is entrezgene ID for eukaryote species and Locus ID for prokaryotes.

I conclude that kegg IDs are not reliable enough, or not sufficiently well described, for my use. I would thus prefer to use ncbi-geneid.


However, when I open the SQLite database created by createKEGGdb, the only gene identifier field I see is gene_or_orf_id in the pathway2gene table.
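
For reference, here is a minimal Python sketch of how the table layout can be checked. It does not read the real KEGG.db file: it builds a small in-memory stand-in so that it runs anywhere. Only the table name pathway2gene and the field gene_or_orf_id come from what I observed; the pathway_id column name, the helper names and the sample rows are assumptions made for illustration.

```python
import sqlite3


def table_columns(conn, table):
    """Return the column names of `table` via SQLite's PRAGMA table_info."""
    # PRAGMA arguments cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    rows = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk); keep the name.
    return [row[1] for row in rows]


def build_demo_db():
    """Create an in-memory stand-in for the database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # Assumption: a two-column layout; only gene_or_orf_id is confirmed above.
    conn.execute(
        "CREATE TABLE pathway2gene (pathway_id TEXT, gene_or_orf_id TEXT)"
    )
    # Placeholder rows, not real KEGG data.
    conn.executemany(
        "INSERT INTO pathway2gene VALUES (?, ?)",
        [("pathway_A", "gene_1"), ("pathway_A", "gene_2")],
    )
    conn.commit()
    return conn


conn = build_demo_db()
columns = table_columns(conn, "pathway2gene")
sample = conn.execute(
    "SELECT gene_or_orf_id FROM pathway2gene LIMIT 5"
).fetchall()
print(columns)
print(sample)
```

To run the same check against the real file, replace build_demo_db() with sqlite3.connect() on the path to the generated SQLite file. Looking at a few values of gene_or_orf_id should then show which kind of identifier is actually stored.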

Questions:

Thank you in advance for your help. All the best