YuLab-SMU / GOSemSim

:golf: GO-terms Semantic Similarity Measures
https://yulab-smu.top/biomedical-knowledge-mining-book/

GOSemSim: restricting similarity measures to sub-ontologies? #14

Closed GRamstein closed 5 years ago

GRamstein commented 7 years ago

Hello,

I am currently using your R package GOSemSim (version 2.2.0) to calculate GO similarities. However, I am quite confused about the restriction of GO measurements by subontology, when using Wang's method to compute similarities. When using the function mgoSim or goSim, I would expect to get a valid similarity value only if GO IDs correspond to the same sub-ontology, but similarities are identical when specifying any value ("MF", "BP" or "CC") for the ont argument, in godata. Below is some code that illustrates the issue.

require(GOSemSim)
# GOSemSim v2.2.0

# semData
d.BP <- godata(OrgDb = NULL, ont = "BP", computeIC = FALSE)
d.MF <- godata(OrgDb = NULL, ont = "MF", computeIC = FALSE)
d.CC <- godata(OrgDb = NULL, ont = "CC", computeIC = FALSE)

# goSim
go <- "GO:0055114" # BP
goSim(go, go, semData = d.BP) # 1
goSim(go, go, semData = d.MF) # 1
goSim(go, go, semData = d.CC) # 1

# mgoSim
go1 <- c("GO:0016491", "GO:0055114") # MF, BP
go2 <- c("GO:0000724", "GO:0006281", "GO:0030915") # BP, BP, CC
mgoSim(go1, go2, semData = d.BP) # 0.169
mgoSim(go1, go2, semData = d.MF) # 0.169
mgoSim(go1, go2, semData = d.CC) # 0.169

As of now, are goSim and mgoSim (with Wang's method) calculating GO similarities regardless of sub-ontology restriction? Does specifying subontologies make any difference?

Thank you

GuangchuangYu commented 7 years ago

Yes, Wang's method does ignore the GO sub-ontology parameter. The ont parameter is reserved for supporting different ontologies (e.g., GO, DO).

Wang's method only considers the DAG structure, and the GO sub-ontologies are encoded in a single graph in GO.db. This is why the ont parameter is ignored when using Wang's method in GOSemSim.
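
To illustrate why a purely graph-based measure needs no ont argument, here is a minimal base-R sketch of a Wang-style similarity on a toy DAG. It is not GOSemSim's implementation: the term IDs, the `parents` list, and the helpers `s_values` and `wang_sim` are hypothetical, and every edge is assumed to carry the same weight of 0.8 (Wang's method uses different weights for different relation types).

```r
# Toy DAG: each term maps to its direct parents.
# Hypothetical IDs, not real GO terms; two disconnected "sub-ontologies".
parents <- list(
  bp_root = character(0),
  bp_mid  = "bp_root",
  bp_x    = "bp_mid",
  bp_y    = "bp_mid",
  mf_root = character(0),
  mf_x    = "mf_root"
)

# S-values of a term and all of its ancestors.
# Assumption: every edge has weight w = 0.8.
s_values <- function(term, w = 0.8) {
  s <- c(1)
  names(s) <- term
  frontier <- term
  while (length(frontier) > 0) {
    nxt <- character(0)
    for (child in frontier) {
      for (p in parents[[child]]) {
        cand <- w * s[[child]]
        # Keep the largest contribution reaching each ancestor.
        if (!(p %in% names(s)) || cand > s[[p]]) {
          s[p] <- cand
          nxt <- c(nxt, p)
        }
      }
    }
    frontier <- unique(nxt)
  }
  s
}

# Similarity: S-values of shared ancestors over the total S-values of both terms.
wang_sim <- function(a, b) {
  sa <- s_values(a)
  sb <- s_values(b)
  shared <- intersect(names(sa), names(sb))
  sum(sa[shared] + sb[shared]) / (sum(sa) + sum(sb))
}

wang_sim("bp_x", "bp_x")  # 1: a term compared with itself
wang_sim("bp_x", "bp_y")  # about 0.59: siblings sharing bp_mid and bp_root
wang_sim("bp_x", "mf_x")  # 0: no shared ancestors
```

In this sketch, terms under different roots share no ancestors, so their similarity is 0 without any sub-ontology filter being applied.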

We would expect similarities between GO IDs from different sub-ontologies to be 0, and you can verify this by passing combine=NULL to mgoSim.

> mgoSim(go1, go2, semData = d.CC, combine=NULL) # 0.169
           GO:0000724 GO:0006281 GO:0030915
GO:0016491      0.000       0.00          0
GO:0055114      0.266       0.29          0
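
The combined score of 0.169 is consistent with a best-match average over this matrix. A minimal sketch, assuming mgoSim's default combine strategy is "BMA" and that BMA is the sum of the row maxima and column maxima divided by the number of rows plus columns; the matrix `m` is copied by hand from the rounded output above rather than recomputed:

```r
# Pairwise similarities, copied from the printed matrix (rounded values).
m <- matrix(
  c(0.000, 0.00, 0,
    0.266, 0.29, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("GO:0016491", "GO:0055114"),
    c("GO:0000724", "GO:0006281", "GO:0030915")
  )
)

# Best-match average: best score for each row term and each column term,
# averaged over all terms (assumed definition, see the note above).
bma <- (sum(apply(m, 1, max)) + sum(apply(m, 2, max))) / (nrow(m) + ncol(m))
round(bma, 3)  # 0.169
```

The zero row for GO:0016491 (MF) and the zero column for GO:0030915 (CC) pull the average down, so cross-ontology pairs still affect the combined score even though their individual similarities are 0.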

GRamstein commented 7 years ago

Thank you for your insight on the issue. This makes perfect sense!