YuLab-SMU / GOSemSim

:golf: GO-terms Semantic Similarity Measures
https://yulab-smu.top/biomedical-knowledge-mining-book/

not compatible with STRSXP error #11

Closed chenhao392 closed 7 years ago

chenhao392 commented 7 years ago

Hi Guangchuang,

I'm really glad to have found your work on calculating GO semantic similarities. However, when I tried the package, I found that the recent update does not work on my machine with measures other than the default "Wang". I'm using the example from the reference manual to demonstrate.

If I run the following with the "Wang" measure, the function works fine.

d <- godata('org.Hs.eg.db', ont="MF", computeIC=FALSE)
geneSim("241", "251", semData=d, measure="Wang")

$geneSim
[1] 0.141

$GO1
[1] "GO:0005515" "GO:0047485" "GO:0050544"

$GO2
[1] "GO:0004035"

However, if I change the measure to one of the other four methods, such as "Resnik", I get a "not compatible with STRSXP" error.

d <- godata('org.Hs.eg.db', ont="MF", computeIC=FALSE)
geneSim("241", "251", semData=d, measure="Resnik")

Error in infoContentMethod_cpp(ID1, ID2, .anc, IC, method, ont) : not compatible with STRSXP

It seems to be a general problem, affecting most of the functions in this package. Please let me know if you have any suggestions.

Best, Hao

aaronwolen commented 7 years ago

I ran into this too. I think the problem is caused by passing an empty numeric vector to infoContentMethod_cpp() when computeIC=FALSE. Flipping this argument in godata() to TRUE fixes the issue for me.
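
To illustrate what "an empty numeric vector" looks like on the R side, here is a small base-R sketch. It does not call GOSemSim; `ic_missing` and `ic_computed` are hypothetical stand-ins for the IC data, and the IC value is made up. The link between the missing names and the STRSXP message is a guess, not something confirmed in this thread.

```r
# Hypothetical stand-ins for the IC vector; this is not GOSemSim's own code.
ic_missing  <- numeric(0)              # roughly what computeIC=FALSE might leave behind
ic_computed <- c("GO:0004035" = 2.5)   # made-up IC value, named by GO term

length(ic_missing)    # 0: there is nothing to look up
names(ic_missing)     # NULL: no character vector of GO IDs is attached
names(ic_computed)    # "GO:0004035"
```

If the C++ side expects a character vector of term names, a NULL here could plausibly surface as a type error rather than a clear "IC missing" message.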

GuangchuangYu commented 7 years ago

@aaronwolen is right. Of course, you need to pre-compute IC using something like:

d <- godata('org.Hs.eg.db', ont="MF", computeIC=TRUE)

chenhao392 commented 7 years ago

@aaronwolen, thanks a lot! The problem is fixed. @GuangchuangYu, thanks. I realize it doesn't make much sense to use Resnik's method without computing IC. In the long run, I think it might be better to disallow the "Resnik + not computing IC" combination or to mention it in the manual, especially for impatient users like me.

GuangchuangYu commented 7 years ago

Yes, a more user-friendly error message is needed.

The updated version will throw an error like this:

> geneSim("241", "251", semData=d, measure="Resnik")
Error in infoContentMethod(t1, t2, method = method, semData) :
  IC data not found, please re-generate your `semData` with `computeIC=TRUE`...
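
A check of this kind can be sketched in plain R roughly as follows. This is only an illustration, not the package's actual implementation: `ToySemData`, `ic_measures`, and `check_ic()` are hypothetical names, and the toy class merely assumes that the semantic data object keeps IC in a numeric slot. The message text is copied from the output above.

```r
library(methods)

# Toy stand-in for the semData object (hypothetical; the real class is
# defined inside GOSemSim and its layout is only assumed here).
setClass("ToySemData", representation(ont = "character", IC = "numeric"))

# Measures that rely on information content; "Wang" does not.
ic_measures <- c("Resnik", "Lin", "Rel", "Jiang")

# Fail early with a readable message when an IC-based measure is
# requested but no IC values are available.
check_ic <- function(semData, measure) {
  if (measure %in% ic_measures && length(semData@IC) == 0) {
    stop("IC data not found, please re-generate your `semData` with `computeIC=TRUE`...",
         call. = FALSE)
  }
  invisible(TRUE)
}

no_ic <- new("ToySemData", ont = "MF", IC = numeric(0))
check_ic(no_ic, "Wang")       # passes silently
# check_ic(no_ic, "Resnik")   # would stop with the message above
```

Running the check before any similarity computation means the user sees the fix (`computeIC=TRUE`) instead of a low-level type error.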