GangLiLab / genekitr

🧬 Gene analysis toolkit based on R
https://www.genekitr.fun
GNU General Public License v3.0

Issue with ORA and importCP #24

zganjarmia closed this issue 1 year ago

zganjarmia commented 1 year ago

Hi, first of all, kudos for developing genekitr. It's a great tool, and your reasons for creating it resonate strongly with my own experience.

I'm currently working with the organism Yarrowia lipolytica and have run into a few issues:

  1. The gene sets for Yarrowia lipolytica (GO and KEGG) are available in the geneset package, but the result of getGO has no organism value attached:

    > mf <- getGO(org = "Yarrowia lipolytica", ont = "mf")
    > head(mf$geneset)
          mf          gene
    1 GO:0000030 YALI0_C04004g
    2 GO:0000030 YALI0_D10549g
    3 GO:0000030 YALI0_B01672g
    4 GO:0000030 YALI0_E02222g
    5 GO:0000030 YALI0_A20922g
    6 GO:0000030 YALI0_A13585g
    > head(mf$geneset_name)
          id                           name
    1 GO:0000030   mannosyltransferase activity
    2 GO:0000049                   tRNA binding
    3 GO:0000149                  SNARE binding
    4 GO:0000166             nucleotide binding
    5 GO:0000175 3'-5'-RNA exonuclease activity
    6 GO:0000287          magnesium ion binding
    > mf$organism
    [1] NA
  2. I then ran into issues with follow-up functions, specifically genORA. The error suggests that there is no short name for the organism, which is perplexing given that Yarrowia lipolytica is included in the geneset package. I also tried adding the organism value manually, but the function still fails.

    >   gs <- genORA(de.genes$ensembl_gene_id, mf$geneset,padj_method = "BH",
    +                p_cutoff = 0.05,)
    Error in if (organism == "hg" | organism == "human" | organism == "hsa" |  : 
    argument is of length zero
  3. I also tried a different route: performing the ORA with clusterProfiler and then importing the results into genekitr. This resulted in an error as well.

    >   ora_go <- clusterProfiler::enrichGO(gene = de.genes,
    +                         OrgDb = org.Ylipolytica.eg.db,
    +                         universe = filtered_data$entrez,
    +                         keyType = "ENTREZID",  
    +                         ont = "ALL",  # all three ontologies: BP, MF and CC
    +                         pAdjustMethod = "BH",  # adjust method
    +                         pvalueCutoff = 0.05,
    +                         minGSSize = 5,
    +                         maxGSSize = 500,
    +                         readable = FALSE)
    >   go_easy <- importCP(ora_go, type = "go")
    Error in mapEnsOrg(object@organism) : 
    Check the latin_short_name in `genekitr::ensOrg_name`

    I'd appreciate any insights or suggestions you might have regarding these issues. Is there a workaround, or am I missing a step? Thanks!
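
The message in item 2 is the one base R raises when an `if` condition has length zero. A minimal sketch of how that can happen, assuming (this is not confirmed in the thread) that the function reads an `organism` element from its gene set argument: a plain data frame such as `mf$geneset` has no such element, so the lookup yields `NULL`. The data frame `geneset_df` below is a made-up stand-in, not real genekitr output.

```r
# Hypothetical stand-in for the data frame stored in mf$geneset.
geneset_df <- data.frame(
  mf = c("GO:0000030", "GO:0000030"),
  gene = c("YALI0_C04004g", "YALI0_D10549g"),
  stringsAsFactors = FALSE
)

# The data frame has no `organism` column, so `$organism` is NULL.
organism <- geneset_df$organism

# Comparing NULL with a string gives logical(0), which `if` cannot evaluate.
msg <- tryCatch(
  {
    if (organism == "hg" | organism == "human") "human" else "other"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

If that assumption holds, passing the whole list returned by getGO (which carries `organism` alongside `geneset`) avoids the empty condition.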

reedliu commented 1 year ago

Hi, I have fixed the ORA issue. Please update the packages (geneset >= 0.2.8 and genekitr >= 1.2.5).
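
To confirm that the installed versions meet these minimums, a small check along the following lines may help. It is only a sketch: `meets_min_version` is a hypothetical helper written for this example, built on base R's `requireNamespace` and `utils::packageVersion`.

```r
# Hypothetical helper: TRUE if `pkg` is installed at `min_version` or later,
# FALSE if it is older or not installed at all.
meets_min_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)
  }
  utils::packageVersion(pkg) >= min_version
}

# Minimum versions named in this comment.
required <- c(geneset = "0.2.8", genekitr = "1.2.5")

status <- vapply(
  names(required),
  function(pkg) meets_min_version(pkg, required[[pkg]]),
  logical(1)
)
print(status)
```

A `FALSE` entry means the package is either missing or older than the version listed above.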

    # test code
    library(geneset)
    mf <- getGO(org = "Yarrowia lipolytica", ont = "mf")
    mf$organism
    # "yarrowia"

    library(genekitr)
    id <- mf$geneset$gene[1:50]
    ora <- genORA(id, mf, padj_method = "BH", p_cutoff = 0.05)

Then you can pass the ora result to the plotting function:

    # example
    plotEnrich(ora, plot_type = "dot")

For the third question, could you send me your org.Ylipolytica.eg.db so that I can reproduce and test the problem?
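
In the meantime, the importCP error message points at `genekitr::ensOrg_name`, so one thing to check is whether the organism appears there. The sketch below assumes that table has a `latin_short_name` column (the name is taken from the error message only). `find_org` is a hypothetical helper and `demo_table` is made-up data, so the example runs without genekitr installed.

```r
# Hypothetical helper: rows of `org_table` whose `latin_short_name`
# matches `pattern`, ignoring case.
find_org <- function(org_table, pattern) {
  hits <- grepl(pattern, org_table$latin_short_name, ignore.case = TRUE)
  org_table[hits, , drop = FALSE]
}

# Made-up table for illustration only; not the real genekitr data.
demo_table <- data.frame(
  latin_short_name = c("hsapiens", "mmusculus", "scerevisiae"),
  stringsAsFactors = FALSE
)
demo_hits <- find_org(demo_table, "lipolytica")
print(nrow(demo_hits))

# With genekitr installed, the same search can be run on the real table.
if (requireNamespace("genekitr", quietly = TRUE)) {
  print(find_org(genekitr::ensOrg_name, "lipolytica"))
}
```

An empty result on the real table would suggest that the organism name stored in the clusterProfiler object has no matching entry.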

reedliu commented 1 year ago

I will close the issue now. If you have any further feedback, please reopen it.