ssayols / rrvgo

Reduce + Visualize Gene Ontology
GNU General Public License v3.0

Bug fix for obsolete GO terms #6

Open RHReynolds opened 3 years ago

RHReynolds commented 3 years ago

Hi,

I recently ran into the following error when using the reduceSimMatrix() function:

Error in FUN(X[[i]], ...) : 
  trying to get slot "Term" from an object of a basic class ("NULL") with no slots

I was able to trace this back to getGoTerm(), one of the underlying functions within reduceSimMatrix(). The problem seems to occur with GO terms (e.g. c("GO:0000988", "GO:0000989")) that have become obsolete in newer versions of the GO database. For these ids, GO.db::GOTERM[[x]] returns NULL, so the @Term slot cannot be accessed. Could you perhaps add a fix so that, if NULL is returned, NA is assigned to that input GO id? Here is an example of what this could look like:

# The first and second terms are obsolete, while the third should return a term
go_ids <- c("GO:0000988", "GO:0000989", "GO:0003723")

sapply(go_ids, function(id) {
  go <- GO.db::GOTERM[[id]]
  if (is.null(go)) {
    term <- NA_character_  # obsolete id: no entry found in GO.db
  } else {
    term <- go@Term
  }
  term
})

Thanks very much! Regina

ssayols commented 3 years ago

Hi Regina, good point, thank you for reporting it. I just committed the fix you suggested. I'll roll it out to Bioconductor now, and it will be available in a couple of days. In the meantime you can reinstall the package from GitHub with devtools::install_github("https://github.com/ssayols/rrvgo").

Btw, if I understand correctly, calculateSimMatrix() will return a similarity score for obsolete terms, but reduceSimMatrix() will not find them in GO.db?

cheers, Sergi

RHReynolds commented 3 years ago

Hi Sergi,

Thanks for implementing the fix so quickly :)

That's a really good question. So I haven't actually been using calculateSimMatrix() to return similarity scores. I've been using GOSemSim::mgoSim(), which I believe is very similar to goSim(), but just returns the semantic similarity in a matrix format. Looking at one of the obsolete terms I mentioned above ("GO:0000988"), it looks like GOSemSim simply returns a score of 1 when comparing the term to itself, and a score of 0 when comparing to any other GO terms.

So yes, as you said, it looks like GOSemSim is computing a similarity score for the obsolete term, but the term is not found in GO.db.
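
To spot terms that show this pattern in a similarity matrix, something like the following could be used. It is only a sketch: find_isolated_terms() is a hypothetical helper, not part of GOSemSim or rrvgo, and the toy matrix holds made-up scores rather than real mgoSim() output. It assumes a square matrix with GO ids as row and column names.

```r
# Hypothetical helper: flag terms that score 0 against every other term,
# i.e. the pattern seen for "GO:0000988" (1 against itself, 0 elsewhere).
find_isolated_terms <- function(simMatrix) {
  off_diag <- simMatrix
  diag(off_diag) <- 0  # ignore self-similarity
  rownames(simMatrix)[rowSums(off_diag != 0, na.rm = TRUE) == 0]
}

# Toy matrix with made-up scores, only to illustrate the pattern
ids <- c("GO:0000988", "GO:0003723", "GO:0003729")
toy <- matrix(c(1, 0,   0,
                0, 1,   0.6,
                0, 0.6, 1),
              nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

isolated <- find_isolated_terms(toy)
print(isolated)
```

A term flagged this way is not necessarily obsolete, so the result is only a hint about which ids to check against GO.db.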

Best, Regina

ssayols commented 3 years ago

Okay, thanks for the feedback.

The way we calculate the similarity matrix in rrvgo doesn't differ much from GOSemSim, and therefore we may create the same inconsistency. I'll investigate it further.
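
Until that is settled, one possible workaround is to drop the unresolved ids from the similarity matrix before reducing it. This is a minimal sketch, not what rrvgo does internally: drop_unresolved_terms() and its lookup argument are hypothetical names, and it assumes a square matrix with the same GO ids as row and column names. The default lookup relies on GO.db::GOTERM[[id]] returning NULL for obsolete ids, as reported in this thread.

```r
# Hypothetical helper (not part of rrvgo): drop the rows and columns of a
# similarity matrix whose GO ids cannot be resolved by `lookup`.
# `lookup` is assumed to return NULL for ids it does not know.
drop_unresolved_terms <- function(simMatrix,
                                  lookup = function(id) GO.db::GOTERM[[id]]) {
  ids <- rownames(simMatrix)
  resolved <- vapply(ids, function(id) !is.null(lookup(id)), logical(1))
  if (any(!resolved)) {
    message("Dropping unresolved GO ids: ",
            paste(ids[!resolved], collapse = ", "))
  }
  simMatrix[resolved, resolved, drop = FALSE]
}

# Toy example with a stand-in lookup, so it runs without GO.db installed
ids <- c("GO:0000988", "GO:0000989", "GO:0003723")
toy <- diag(3)
dimnames(toy) <- list(ids, ids)
known <- list("GO:0003723" = "RNA binding")

filtered <- drop_unresolved_terms(toy, lookup = function(id) known[[id]])
print(rownames(filtered))
```

With GO.db installed, calling drop_unresolved_terms(simMatrix) with the default lookup would apply the same NULL check as the example earlier in the thread.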