CDK-R / cdkr

Integrating R and the CDK
https://cdk-r.github.io/cdkr/

fingerprint::distance #48

Closed varungiri closed 7 years ago

varungiri commented 7 years ago

The first call returns the distance; the second call crashes with a segmentation fault:

library('rcdk', 'fingerprint')
a <- parse.smiles('CCC')
b <- parse.smiles('CCCO')
af <- get.fingerprint(a[[1]])
bf <- get.fingerprint(b[[1]])
fingerprint::distance(af, bf)
[1] 0.4285714
fingerprint::distance(af, bf)
Segmentation fault (core dumped)

This happens even if I use a new set of feature vectors.

zachcp commented 7 years ago

Hi @varungiri,

Can you replicate the error and provide the output of sessionInfo() please? I can use your code fine.

Thanks, zach cp

varungiri commented 7 years ago

library(rcdk, fingerprint)
Loading required package: fingerprint
a <- get.fingerprint(parse.smiles('CCCO')[[1]])
b <- get.fingerprint(parse.smiles('CCCC')[[1]])
distance(a,b)
[1] 0.375
sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/lapack/liblapack.so.3.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=de_DE.UTF-8
[9] LC_ADDRESS=de_DE.UTF-8 LC_TELEPHONE=de_DE.UTF-8
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=de_DE.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rcdk_3.3.8 fingerprint_3.5.4

loaded via a namespace (and not attached):
[1] compiler_3.4.0 tools_3.4.0 parallel_3.4.0 rcdklibs_2.0
[5] iterators_1.0.8 itertools_0.1-3 rJava_0.9-8 png_0.1-7

distance(a,b)
Segmentation fault (core dumped)

I will meanwhile try to reinstall R and check.

rajarshi commented 7 years ago

I can confirm it works for me as well.

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.12.5 (Sierra)

locale:
[1] C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rcdk_3.4.1     dplyr_0.4.3    plyr_1.8.3     reshape2_1.4.2 ggplot2_2.1.0  ncgcdb_1.0.0   ROracle_1.2-2  DBI_0.5-1     

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.4.5     png_0.1-7         assertthat_0.1    grid_3.3.1        R6_2.1.2          gtable_0.2.0      magrittr_1.5      scales_0.4.0      itertools_0.1-3   stringi_1.1.2    
[11] iterators_1.0.8   tools_3.3.1       stringr_1.1.0     fingerprint_3.5.3 munsell_0.4.3     rcdklibs_2.0      parallel_3.3.1    colorspace_1.2-6  rJava_0.9-8      

zachcp commented 7 years ago

@varungiri can you try installing rcdk off of Rajarshi's branch? I see you are using rcdk 3.3.8 but we are using 3.4.1 (now 3.4.2)

varungiri commented 7 years ago

In the meantime I had also been trying the CRAN version, but I get the error nonetheless.

a <- get.fingerprint(parse.smiles('CCCO')[[1]])
b <- get.fingerprint(parse.smiles('CCCOC')[[1]])
distance(a,b)
[1] 0.7
sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/lapack/liblapack.so.3.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=de_DE.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=de_DE.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=de_DE.UTF-8 LC_NAME=de_DE.UTF-8
[9] LC_ADDRESS=de_DE.UTF-8 LC_TELEPHONE=de_DE.UTF-8
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=de_DE.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] rcdk_3.4.2 rcdklibs_2.0 rJava_0.9-8 devtools_1.13.2
[5] fingerprint_3.5.4

loaded via a namespace (and not attached):
[1] png_0.1-7 digest_0.6.10 withr_1.0.2 R6_2.2.1
[5] git2r_0.18.0 httr_1.2.1 itertools_0.1-3 curl_2.6
[9] iterators_1.0.8 tools_3.4.0 parallel_3.4.0 compiler_3.4.0
[13] tcltk_3.4.0 memoise_1.1.0

distance(a,b)
Segmentation fault (core dumped)

I tried reinstalling R, but in vain. Maybe I have a broken dependency somewhere. On the other machines I tried, it worked for me too. Any idea what I can check?

rajarshi commented 7 years ago

It is quite weird; I assume it's happening in the C code. If you can send a core dump, I can try poking around. But without being able to reproduce it, it's a bit difficult.

schymane commented 7 years ago

Works for me too:

library(fingerprint)
a <- get.fingerprint(parse.smiles('CCCO')[[1]])
b <- get.fingerprint(parse.smiles('CCCC')[[1]])
distance(a,b)
[1] 0.375
distance(a,b)
[1] 0.375
distance(a,b)
[1] 0.375
sessionInfo()
R version 3.2.5 (2016-04-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=English_Australia.1252 LC_CTYPE=English_Australia.1252 LC_MONETARY=English_Australia.1252
[4] LC_NUMERIC=C LC_TIME=English_Australia.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] fingerprint_3.5.4 rcdk_3.4.2 rcdklibs_2.0 rJava_0.9-8 RMassBankData_1.8.0
[6] RMassBank_2.3.1 Rcpp_0.12.9

varungiri commented 7 years ago

I checked: calls to fp.sim.matrix run without any issue, and even repeated calls do not break anything. But the second call to distance still crashes. So it does not look like the problem is in the C code!

library(rcdk)
Loading required package: rcdklibs
Loading required package: rJava
library(fingerprint)
a <- get.fingerprint(parse.smiles('CCCC')[[1]])
b <- get.fingerprint(parse.smiles('CCCO')[[1]])
c <- get.fingerprint(parse.smiles('CCCN')[[1]])
fingerprint::fp.sim.matrix(c(a,b,c))
      [,1]  [,2]  [,3]
[1,] 1.000 0.375 0.375
[2,] 0.375 1.000 0.400
[3,] 0.375 0.400 1.000
fingerprint::fp.sim.matrix(c(a,b,c))
      [,1]  [,2]  [,3]
[1,] 1.000 0.375 0.375
[2,] 0.375 1.000 0.400
[3,] 0.375 0.400 1.000
fingerprint::fp.sim.matrix(c(a,b,c))
      [,1]  [,2]  [,3]
[1,] 1.000 0.375 0.375
[2,] 0.375 1.000 0.400
[3,] 0.375 0.400 1.000
fingerprint::fp.sim.matrix(c(a,b,c))
      [,1]  [,2]  [,3]
[1,] 1.000 0.375 0.375
[2,] 0.375 1.000 0.400
[3,] 0.375 0.400 1.000
fingerprint::fp.sim.matrix(c(a,b,c))
      [,1]  [,2]  [,3]
[1,] 1.000 0.375 0.375
[2,] 0.375 1.000 0.400
[3,] 0.375 0.400 1.000
fingerprint::distance(a,b)
[1] 0.375
fingerprint::distance(a,c)
Segmentation fault (core dumped)
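
One way to probe whether compiled code is involved (a sketch, not something that was tried in this thread) is to compute the Tanimoto coefficient in plain R, with no call into native code. The helper `tanimoto_bits` below is hypothetical, and it assumes each fingerprint is available as a vector of on-bit positions. The fingerprint objects are believed to expose this as a `bits` slot, but that is an assumption here, so the example uses made-up integer vectors instead.

```r
# Hypothetical helper: Tanimoto coefficient from two vectors of "on" bit
# positions, computed with base R set operations only (no compiled code).
tanimoto_bits <- function(x, y) {
  x <- unique(x)
  y <- unique(y)
  n_union <- length(union(x, y))
  # Two empty fingerprints have no defined similarity.
  if (n_union == 0) return(NA_real_)
  length(intersect(x, y)) / n_union
}

# Made-up bit positions for illustration; these are not real fingerprints.
bits_a <- c(1L, 2L, 3L)
bits_b <- c(2L, 3L, 4L, 5L)

# Call it several times in a row, mirroring the repeated calls above.
for (i in 1:3) print(tanimoto_bits(bits_a, bits_b))
```

If repeated calls to a pure-R version like this stay stable on the affected machine while `fingerprint::distance` crashes on the second call, that would point toward the native code path or the local installation rather than the R-level logic.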

rajarshi commented 7 years ago

Unfortunately, I still can't reproduce this.

varungiri commented 7 years ago

Looks like a system-specific error. I could not isolate its source.