YuLab-SMU / GOSemSim

:golf: GO-terms Semantic Similarity Measures
https://yulab-smu.top/biomedical-knowledge-mining-book/

Error that impacts BioCor #32

Closed llrs closed 3 years ago

llrs commented 3 years ago

Hi, I am the developer of BioCor, which suggests GOSemSim and uses it in its vignette.

On the latest Bioconductor (the soon-to-be-released 3.12), I get an error in the vignette that I think arises from GOSemSim. Below is a reproducible example on R 4.0.3:

gene_ids <- c("23098", "4843", "5431", "4710", "4287", "5217", "7321", "1207", 
  "9891", "27252", "56922", "1136", "51668", "5241", "54700", "43", 
  "11020", "5372", "7528", "79913", "2717", "6650", "9738", "3718", 
  "9827", "23586", "9148", "975", "84274", "80824", "8078", "10686", 
  "6152", "374291", "60482", "6509", "2582", "10560", "9194", "5228", 
  "25950", "10564", "26212", "8189", "94101", "8520", "968", "4301", 
  "2643", "51763", "23164", "254428", "29079", "56886", "9380", 
  "85465", "2247", "254013", "54509", "4123", "3801", "27043", 
  "10907", "84958", "26230", "9589", "908", "27147", "6129", "6749", 
  "2308", "7069", "3628", "5352", "1525", "58494", "9337", "7273", 
  "10670", "138199", "6750", "26958", "136227", "29115", "51005", 
  "7086", "285231", "4724", "9232", "1020", "2923", "124975", "55048", 
  "55867", "3516", "9677", "3965", "6940", "27258", "3866", "54811", 
  "5707", "201626", "7025", "10458", "127064", "126375", "9735", 
  "3852", "388567", "55615", "401541", "388552", "728", "5660", 
  "5336", "8337", "5004", "3833", "26063", "51750", "3690", "92335"
)
library("GOSemSim")
library("org.Hs.eg.db")
BP <- godata("org.Hs.eg.db", ont = "BP", computeIC = TRUE)
gsGO <- GOSemSim::mgeneSim(gene_ids, semData = BP, measure = "Resnik", verbose = TRUE)

The error, as you can see below, seems to be related to changes in Rcpp:

Error in infoContentMethod_cpp(ID1, ID2, .anc, IC, method, ont) : 
  Expecting a string vector: [type=logical; required=STRSXP].
traceback()
6: stop(structure(list(message = "Expecting a string vector: [type=logical; required=STRSXP].", 
       call = infoContentMethod_cpp(ID1, ID2, .anc, IC, method, ont), 
       cppstack = NULL), class = c("Rcpp::not_compatible", "C++Error", 
       "error", "condition")))
5: infoContentMethod_cpp(ID1, ID2, .anc, IC, method, ont)
4: infoContentMethod(t1, t2, method = method, semData)
3: termSim(GO1, GO2, semData, method = measure)
2: mgoSim(uniqueGO, uniqueGO, semData, measure = measure, combine = NULL)
1: GOSemSim::mgeneSim(a, semData = BP, measure = "Resnik", verbose = TRUE)
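The message says the C++ routine received a logical value where a character (string) vector was required. In R a bare `NA` is logical, so one plausible trigger is a lookup that yields `NA` for a term with no entry. This is an assumption: the thread does not state the root cause. The sketch below uses only base R; `anc`, the term names, and `as_string_vector()` are hypothetical and not part of GOSemSim.

```r
# A bare NA is logical; a typed missing string is character.
typeof(NA)
typeof(NA_character_)

# Hypothetical ancestor lookup (illustrative values, not real GO data):
# the second term has no recorded ancestors and is stored as a bare NA.
anc <- list(
  term_with_ancestors    = c("GO:0008150", "GO:0009987"),
  term_without_ancestors = NA
)

# Hypothetical guard: drop missing entries and always return a character vector.
as_string_vector <- function(x) {
  x <- x[!is.na(x)]
  as.character(x)
}

# Before the guard the second element is logical; afterwards both are character.
before <- vapply(anc, typeof, character(1))
clean  <- lapply(anc, as_string_vector)
after  <- vapply(clean, typeof, character(1))
print(before)
print(after)
```

Coercing on the R side before the call keeps the C++ signature strict while tolerating terms that have no entry.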

Session Info
```
─ Session info ───────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 4.0.3 (2020-10-10)
 os       Ubuntu 20.04.1 LTS          
 system   x86_64, linux-gnu           
 ui       RStudio                     
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       Europe/Madrid               
 date     2020-10-22                  

─ Packages ───────────────────────────────────────────────────────────────────
 ! package              * version  date       lib source        
   airway               * 1.9.0    2020-04-30 [1] Bioconductor  
   annotate               1.67.2   2020-10-15 [1] Bioconductor  
   AnnotationDbi        * 1.51.3   2020-07-25 [1] Bioconductor  
   assertthat             0.2.1    2019-03-21 [1] CRAN (R 4.0.3)
   backports              1.1.10   2020-09-15 [1] CRAN (R 4.0.3)
   base64enc              0.1-3    2015-07-28 [1] CRAN (R 4.0.3)
   Biobase              * 2.49.1   2020-09-03 [1] Bioconductor  
   BiocGenerics         * 0.35.4   2020-06-04 [1] Bioconductor  
   BiocManager            1.30.10  2019-11-16 [1] CRAN (R 4.0.3)
 P BioCor               * 1.13.1   2020-10-13 [?] Bioconductor  
   BiocParallel           1.23.3   2020-10-17 [1] Bioconductor  
   bit                    4.0.4    2020-08-04 [1] CRAN (R 4.0.3)
   bit64                  4.0.5    2020-08-30 [1] CRAN (R 4.0.3)
   bitops                 1.0-6    2013-08-17 [1] CRAN (R 4.0.3)
   blob                   1.2.1    2020-01-20 [1] CRAN (R 4.0.3)
   boot                 * 1.3-25   2020-04-26 [1] CRAN (R 4.0.3)
   callr                  3.5.1    2020-10-13 [1] CRAN (R 4.0.3)
   checkmate              2.0.0    2020-02-06 [1] CRAN (R 4.0.3)
   cli                    2.1.0    2020-10-12 [1] CRAN (R 4.0.3)
   cluster                2.1.0    2019-06-19 [1] CRAN (R 4.0.3)
   codetools              0.2-16   2018-12-24 [1] CRAN (R 4.0.3)
   colorspace             1.4-1    2019-03-18 [1] CRAN (R 4.0.3)
   crayon                 1.3.4    2017-09-16 [1] CRAN (R 4.0.3)
   data.table             1.13.2   2020-10-19 [1] CRAN (R 4.0.3)
   DBI                    1.1.0    2019-12-15 [1] CRAN (R 4.0.3)
   DelayedArray           0.15.16  2020-10-06 [1] Bioconductor  
   desc                   1.2.0    2018-05-01 [1] CRAN (R 4.0.3)
   DESeq2               * 1.29.16  2020-10-13 [1] Bioconductor  
   devtools             * 2.3.2    2020-09-18 [1] CRAN (R 4.0.3)
   digest                 0.6.26   2020-10-17 [1] CRAN (R 4.0.3)
   ellipsis               0.3.1    2020-05-15 [1] CRAN (R 4.0.3)
   evaluate               0.14     2019-05-28 [1] CRAN (R 4.0.3)
   fansi                  0.4.1    2020-01-08 [1] CRAN (R 4.0.3)
   foreign                0.8-80   2020-05-24 [1] CRAN (R 4.0.3)
   Formula              * 1.2-4    2020-10-16 [1] CRAN (R 4.0.3)
   fs                     1.5.0    2020-07-31 [1] CRAN (R 4.0.3)
   genefilter             1.71.0   2020-04-27 [1] Bioconductor  
   geneplotter            1.67.0   2020-04-27 [1] Bioconductor  
   GenomeInfoDb         * 1.25.11  2020-09-03 [1] Bioconductor  
   GenomeInfoDbData       1.2.4    2020-10-22 [1] Bioconductor  
   GenomicRanges        * 1.41.6   2020-08-12 [1] Bioconductor  
   ggplot2              * 3.3.2    2020-06-19 [1] CRAN (R 4.0.3)
   glue                   1.4.2    2020-08-27 [1] CRAN (R 4.0.3)
   GO.db                  3.12.0   2020-10-22 [1] Bioconductor  
   GOSemSim             * 2.15.2   2020-09-04 [1] Bioconductor  
   graph                  1.67.1   2020-05-27 [1] Bioconductor  
   gridExtra              2.3      2017-09-09 [1] CRAN (R 4.0.3)
   GSEABase               1.51.1   2020-05-29 [1] Bioconductor  
   gtable                 0.3.0    2019-03-25 [1] CRAN (R 4.0.3)
   highr                  0.8      2019-03-20 [1] CRAN (R 4.0.3)
   Hmisc                * 4.4-1    2020-08-10 [1] CRAN (R 4.0.3)
   htmlTable              2.1.0    2020-09-16 [1] CRAN (R 4.0.3)
   htmltools              0.5.0    2020-06-16 [1] CRAN (R 4.0.3)
   htmlwidgets            1.5.2    2020-10-03 [1] CRAN (R 4.0.3)
   httr                   1.4.2    2020-07-20 [1] CRAN (R 4.0.3)
   IRanges              * 2.23.10  2020-06-13 [1] Bioconductor  
   jpeg                   0.1-8.1  2019-10-24 [1] CRAN (R 4.0.3)
   knitr                  1.30     2020-09-22 [1] CRAN (R 4.0.3)
   lattice              * 0.20-41  2020-04-02 [1] CRAN (R 4.0.3)
   latticeExtra           0.6-29   2019-12-19 [1] CRAN (R 4.0.3)
   lifecycle              0.2.0    2020-03-06 [1] CRAN (R 4.0.3)
   locfit                 1.5-9.4  2020-03-25 [1] CRAN (R 4.0.3)
   magrittr               1.5      2014-11-22 [1] CRAN (R 4.0.3)
   Matrix                 1.2-18   2019-11-27 [1] CRAN (R 4.0.3)
   MatrixGenerics       * 1.1.8    2020-10-20 [1] Bioconductor  
   matrixStats          * 0.57.0   2020-09-25 [1] CRAN (R 4.0.3)
   memoise                1.1.0    2017-04-21 [1] CRAN (R 4.0.3)
   munsell                0.5.0    2018-06-12 [1] CRAN (R 4.0.3)
   nnet                   7.3-14   2020-04-26 [1] CRAN (R 4.0.3)
   org.Hs.eg.db         * 3.12.0   2020-10-22 [1] Bioconductor  
   pillar                 1.4.6    2020-07-10 [1] CRAN (R 4.0.3)
   pkgbuild               1.1.0    2020-07-13 [1] CRAN (R 4.0.3)
   pkgconfig              2.0.3    2019-09-22 [1] CRAN (R 4.0.3)
   pkgload                1.1.0    2020-05-29 [1] CRAN (R 4.0.3)
   png                    0.1-7    2013-12-03 [1] CRAN (R 4.0.3)
   prettyunits            1.1.1    2020-01-24 [1] CRAN (R 4.0.3)
   processx               3.4.4    2020-09-03 [1] CRAN (R 4.0.3)
   ps                     1.4.0    2020-10-07 [1] CRAN (R 4.0.3)
   R6                     2.4.1    2019-11-12 [1] CRAN (R 4.0.3)
   RColorBrewer           1.1-2    2014-12-07 [1] CRAN (R 4.0.3)
   Rcpp                   1.0.5    2020-07-06 [1] CRAN (R 4.0.3)
   RCurl                  1.98-1.2 2020-04-18 [1] CRAN (R 4.0.3)
   reactome.db            1.74.0   2020-10-22 [1] Bioconductor  
   remotes                2.2.0    2020-07-21 [1] CRAN (R 4.0.3)
   rlang                  0.4.8    2020-10-08 [1] CRAN (R 4.0.3)
   rmarkdown              2.5      2020-10-21 [1] CRAN (R 4.0.3)
   rpart                  4.1-15   2019-04-12 [1] CRAN (R 4.0.3)
   rprojroot              1.3-2    2018-01-03 [1] CRAN (R 4.0.3)
   RSQLite                2.2.1    2020-09-30 [1] CRAN (R 4.0.3)
   rstudioapi             0.11     2020-02-07 [1] CRAN (R 4.0.3)
   S4Vectors            * 0.27.14  2020-10-09 [1] Bioconductor  
   scales                 1.1.1    2020-05-11 [1] CRAN (R 4.0.3)
   sessioninfo            1.1.1    2018-11-05 [1] CRAN (R 4.0.3)
   stringi                1.5.3    2020-09-09 [1] CRAN (R 4.0.3)
   stringr                1.4.0    2019-02-10 [1] CRAN (R 4.0.3)
   SummarizedExperiment * 1.19.9   2020-10-01 [1] Bioconductor  
   survival             * 3.2-7    2020-09-28 [1] CRAN (R 4.0.3)
   targetscan.Hs.eg.db  * 0.6.1    2020-10-22 [1] Bioconductor  
   testthat             * 2.3.2    2020-03-02 [1] CRAN (R 4.0.3)
   tibble                 3.0.4    2020-10-12 [1] CRAN (R 4.0.3)
   usethis              * 1.6.3    2020-09-17 [1] CRAN (R 4.0.3)
   vctrs                  0.3.4    2020-08-29 [1] CRAN (R 4.0.3)
   withr                  2.3.0    2020-09-22 [1] CRAN (R 4.0.3)
   xfun                   0.18     2020-09-29 [1] CRAN (R 4.0.3)
   XML                    3.99-0.5 2020-07-23 [1] CRAN (R 4.0.3)
   xtable                 1.8-4    2019-04-21 [1] CRAN (R 4.0.3)
   XVector                0.29.3   2020-06-25 [1] Bioconductor  
   yaml                   2.2.1    2020-02-01 [1] CRAN (R 4.0.3)
   zlibbioc               1.35.0   2020-04-27 [1] Bioconductor  
```

Many thanks!

GuangchuangYu commented 3 years ago

cannot reproduce your issue:

> BP <- godata("org.Hs.eg.db", ont = "BP", computeIC = TRUE)
preparing gene to GO mapping data...
preparing IC data...

> gsGO <- GOSemSim::mgeneSim(gene_ids, semData = BP, measure = "Resnik", verbose = TRUE)
  |======================================================================| 100%
> devtools::session_info()
─ Session info ───────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 4.0.3 (2020-10-10)
 os       Arch Linux                  
 system   x86_64, linux-gnu           
 ui       X11                         
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       Asia/Chongqing              
 date     2020-10-23                  

─ Packages ───────────────────────────────────────────────────────────────────
 package       * version date       lib source        
 AnnotationDbi * 1.50.3  2020-07-25 [1] Bioconductor  
 assertthat      0.2.1   2019-03-21 [1] CRAN (R 4.0.0)
 backports       1.1.10  2020-09-15 [1] CRAN (R 4.0.2)
 Biobase       * 2.48.0  2020-04-27 [1] Bioconductor  
 BiocGenerics  * 0.34.0  2020-04-27 [1] Bioconductor  
 BiocManager     1.30.10 2019-11-16 [1] CRAN (R 4.0.0)
 bit             4.0.4   2020-08-04 [1] CRAN (R 4.0.2)
 bit64           4.0.5   2020-08-30 [1] CRAN (R 4.0.2)
 blob            1.2.1   2020-01-20 [1] CRAN (R 4.0.0)
 callr           3.5.1   2020-10-13 [1] CRAN (R 4.0.3)
 cli             2.1.0   2020-10-12 [1] CRAN (R 4.0.3)
 conflicted    * 1.0.4   2019-06-21 [1] CRAN (R 4.0.0)
 crayon          1.3.4   2017-09-16 [1] CRAN (R 4.0.0)
 DBI             1.1.0   2019-12-15 [1] CRAN (R 4.0.0)
 desc            1.2.0   2018-05-01 [1] CRAN (R 4.0.0)
 devtools        2.3.2   2020-09-18 [1] CRAN (R 4.0.2)
 digest          0.6.26  2020-10-17 [1] CRAN (R 4.0.3)
 ellipsis        0.3.1   2020-05-15 [1] CRAN (R 4.0.0)
 fansi           0.4.1   2020-01-08 [1] CRAN (R 4.0.0)
 fs              1.5.0   2020-07-31 [1] CRAN (R 4.0.2)
 glue            1.4.2   2020-08-27 [1] CRAN (R 4.0.2)
 GO.db           3.11.4  2020-06-15 [1] Bioconductor  
 GOSemSim      * 2.15.2  2020-10-23 [1] Bioconductor  
 IRanges       * 2.22.2  2020-05-21 [1] Bioconductor  
 magrittr      * 1.5     2014-11-22 [1] CRAN (R 4.0.0)
 memoise         1.1.0   2017-04-21 [1] CRAN (R 4.0.0)
 org.Hs.eg.db  * 3.11.4  2020-06-15 [1] Bioconductor  
 pkgbuild        1.1.0   2020-07-13 [1] CRAN (R 4.0.2)
 pkgconfig       2.0.3   2019-09-22 [1] CRAN (R 4.0.0)
 pkgload         1.1.0   2020-05-29 [1] CRAN (R 4.0.0)
 prettyunits     1.1.1   2020-01-24 [1] CRAN (R 4.0.0)
 processx        3.4.4   2020-09-03 [1] CRAN (R 4.0.2)
 ps              1.4.0   2020-10-07 [1] CRAN (R 4.0.2)
 R6              2.4.1   2019-11-12 [1] CRAN (R 4.0.0)
 Rcpp            1.0.5   2020-07-06 [1] CRAN (R 4.0.2)
 remotes         2.2.0   2020-07-21 [1] CRAN (R 4.0.2)
 rlang           0.4.8   2020-10-08 [1] CRAN (R 4.0.2)
 rprojroot       1.3-2   2018-01-03 [1] CRAN (R 4.0.0)
 RSQLite         2.2.1   2020-09-30 [1] CRAN (R 4.0.2)
 rvcheck       * 0.1.8   2020-04-26 [1] local         
 S4Vectors     * 0.26.1  2020-05-16 [1] Bioconductor  
 sessioninfo     1.1.1   2018-11-05 [1] CRAN (R 4.0.0)
 testthat        2.3.2   2020-03-02 [1] CRAN (R 4.0.0)
 usethis         1.6.3   2020-09-17 [1] CRAN (R 4.0.2)
 vctrs           0.3.4   2020-08-29 [1] CRAN (R 4.0.2)
 wget          * 0.0.1   2020-04-27 [1] local         
 withr           2.3.0   2020-09-22 [1] CRAN (R 4.0.2)

llrs commented 3 years ago

There are several differences in the session info. The most notable are that you are using org.Hs.eg.db 3.11.4 while I am using 3.12.0, and that Biobase (2.49.1 vs. 2.48.0) and BiocGenerics (0.35.4 vs. 0.34.0) are also newer on my computer.

Did you install them using BiocManager::install(version = "3.12")?
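To make the comparison concrete, here is a minimal base-R sketch that lines up a few versions copied from the two session listings in this thread and flags the ones that differ. The variable names `reporter` and `maintainer` are only labels chosen for this illustration.

```r
# Versions copied from the two session listings in this thread.
reporter <- c(
  AnnotationDbi = "1.51.3", Biobase = "2.49.1", BiocGenerics = "0.35.4",
  GO.db = "3.12.0", GOSemSim = "2.15.2", IRanges = "2.23.10",
  org.Hs.eg.db = "3.12.0", Rcpp = "1.0.5", S4Vectors = "0.27.14"
)
maintainer <- c(
  AnnotationDbi = "1.50.3", Biobase = "2.48.0", BiocGenerics = "0.34.0",
  GO.db = "3.11.4", GOSemSim = "2.15.2", IRanges = "2.22.2",
  org.Hs.eg.db = "3.11.4", Rcpp = "1.0.5", S4Vectors = "0.26.1"
)

# Compare only the packages present in both listings.
shared <- intersect(names(reporter), names(maintainer))
cmp <- data.frame(
  package    = shared,
  reporter   = unname(reporter[shared]),
  maintainer = unname(maintainer[shared]),
  stringsAsFactors = FALSE
)

# package_version() compares component-wise, so "2.23.10" sorts above "2.22.2".
cmp$differs <- package_version(cmp$reporter) != package_version(cmp$maintainer)

# Show only the packages whose versions differ between the two sessions.
print(cmp[cmp$differs, ], row.names = FALSE)
```

In this subset Rcpp and GOSemSim match, while the annotation and core Bioconductor packages differ, which is consistent with the two sessions being on different Bioconductor releases. On a live installation, `BiocManager::version()` reports the release in use.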

GuangchuangYu commented 3 years ago

should be fixed in GOSemSim v >= 2.16.1
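A quick way to check whether a given GOSemSim version is at or above the release named here is sketched below. `has_fix()` is a hypothetical helper written for this illustration; the check against the installed copy runs only if GOSemSim is available.

```r
# Version named above as containing the fix.
fixed_in <- package_version("2.16.1")

# Hypothetical helper: TRUE when version string `v` is at or above `fixed_in`.
has_fix <- function(v) package_version(v) >= fixed_in

has_fix("2.15.2")  # the version in the original report
has_fix("2.16.1")

# Check the installed copy, if there is one.
if (requireNamespace("GOSemSim", quietly = TRUE)) {
  print(has_fix(as.character(utils::packageVersion("GOSemSim"))))
}
```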

llrs commented 3 years ago

Great! Has this been pushed to the master and RELEASE_3_12 branches? I ask because I want to reactivate a code chunk in a BioCor vignette.
Would it be possible to add some tests to catch this kind of error earlier in the future?
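As a rough idea of what such a check could look like, here is a minimal sketch in base R. `runs_cleanly()` is a hypothetical helper, not part of GOSemSim or BioCor, and the five gene IDs are simply the first five from the report; whether such a small subset would have triggered the original failure is not established in this thread, so the full list is the safer input.

```r
# Hypothetical helper: TRUE if evaluating `expr` raises no error, FALSE otherwise.
# `expr` is a lazy argument, so it is only evaluated inside tryCatch().
runs_cleanly <- function(expr) {
  tryCatch({
    force(expr)
    TRUE
  }, error = function(e) FALSE)
}

# Run the smoke test only when the packages it needs are installed.
if (requireNamespace("GOSemSim", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  genes <- c("23098", "4843", "5431", "4710", "4287")
  bp <- GOSemSim::godata("org.Hs.eg.db", ont = "BP", computeIC = TRUE)
  stopifnot(runs_cleanly(
    GOSemSim::mgeneSim(genes, semData = bp, measure = "Resnik", verbose = FALSE)
  ))
}
```

The same call could be wrapped in a `testthat::test_that()` block so that it runs against each new annotation release during package checks.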

GuangchuangYu commented 3 years ago

Already pushed to the 3.12 branch.