CDK-R / cdkr

Integrating R and the CDK
https://cdk-r.github.io/cdkr/

NPE if SDF file has R# atom #7

Closed sneumann closed 12 years ago

sneumann commented 12 years ago

Hi,

I am trying to iterate over ChEBI with the code below, using a current rcdk git snapshot, in order to calculate fingerprints. The iteration fails as soon as rcdk hits a compound with an R# "atom", e.g. CHEBI:15489, because hasNext(moliter) fails with an NPE:

Error in .jcall(sreader, "Z", "hasNext") : java.lang.NullPointerException

I have no problem if such a molecule comes back as NULL, but the iteration should be able to continue to the end of the file.

Yours, Steffen

P.S. That github Markdown for code looks cool!

library(rcdk)
sessionInfo()

chebifile <- "ChEBI_complete.sdf"

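# iterate over a large file
moliter <- iload.molecules(chebifile, type="sdf")
i <- 1
chebifp <- c(new("fingerprint"))

# hasNext() throws the NullPointerException once the reader
# reaches a record containing an R# atom (e.g. CHEBI:15489)
while(hasNext(moliter)) {
    mol <- nextElem(moliter)
}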

> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=C                 LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] rcdk_3.1.5        iterators_1.0.3   png_0.1-3         fingerprint_3.4.6
[5] rcdklibs_1.4.5    rJava_0.9-2      

rajarshi commented 12 years ago

Thanks for pointing this out. I've uploaded new versions to GitHub, but there are still some bugs somewhere.

In reply to sneumann's message of Wed, Feb 22, 2012, 2:46 AM.

Rajarshi Guha, NIH Chemical Genomics Center

sneumann commented 12 years ago

Hi, yes, the issue is solved for me; it now skips the problematic compounds.

Thanks for that super-quick response, Yours, Steffen