nanxstats / Rcpi

💊 Molecular informatics toolkit with integration of bioinformatics and cheminformatics tools for drug discovery
https://nanx.me/Rcpi/
Artistic License 2.0

Rcpi extractDrugAIO take long time #1

Closed liuxianghui closed 7 years ago

liuxianghui commented 7 years ago

Dear Nan: I am trying to use Rcpi to calculate descriptors for my 160 small molecules. However, my R session seems to hang. I checked your manual, and I did use the 3D structures. Your test dataset OptAA3d.sdf runs without problems. Should I clean up the structures further? I prepared my SD file in ChemFinder and converted it to 3D using ChemAxon. Please kindly suggest, Xianghui

liuxianghui commented 7 years ago

I ran this:

data = extractDrugAIO(sdf, warn = FALSE)

and it eventually ended with the following error:

Error in .jcall(dval, "Ljava/lang/Exception;", "getException") : java.lang.OutOfMemoryError: Java heap space

nanxstats commented 7 years ago

Hi XiangHui,

Thanks for the feedback. Some of the descriptors may require a considerable amount of memory and CPU time to compute, especially when the molecules have 3D coordinate information. The error here seems to be caused by the JVM running out of heap space.

One possible solution is to increase the JVM heap size manually (and maybe also use a computer with more physical memory): http://stackoverflow.com/questions/21937640/
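As a rough sketch of the heap size change: with rJava, the limit is usually raised through the `java.parameters` option, which has to be set before the JVM starts, so in a fresh R session and before loading Rcpi. The 4 GB value is only an assumption for illustration; pick whatever your machine can spare.

```r
# Run this in a fresh R session, before rJava, rcdk, or Rcpi is loaded:
# the JVM reads its heap limit only once, when it starts.
options(java.parameters = "-Xmx4g")  # assumed value: 4 GB maximum heap

if (requireNamespace("Rcpi", quietly = TRUE)) {
  library("Rcpi")
  rJava::.jinit()  # makes sure the JVM is running; harmless if it already is
  # Optional check: the maximum heap the JVM actually got, in MB
  rt <- rJava::.jcall("java/lang/Runtime", "Ljava/lang/Runtime;", "getRuntime")
  print(rJava::.jcall(rt, "J", "maxMemory") / 1024^2)
}
```

If the printed value does not reflect the new setting, the JVM was probably already started earlier in the session; restart R and try again.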

If that doesn't work well, another workaround is to split the 160 structures manually into several mini-batches (e.g. 8 SDF files, each containing 20 molecules), compute the descriptors for each batch separately, then row-bind the 8 descriptor matrices together at the end.
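The mini-batch idea could look roughly like the sketch below. `split_sdf()` is a hypothetical helper written here in base R (it is not part of Rcpi); it relies on the SDF convention that each record ends with a `$$$$` line. The file name `molecules.sdf` is a placeholder as well.

```r
# Hypothetical helper (not part of Rcpi): split a multi-record SDF file
# into smaller files holding at most `batch_size` molecules each.
split_sdf <- function(path, batch_size = 20, out_dir = tempdir()) {
  lines <- readLines(path, warn = FALSE)
  # Each SDF record is terminated by a line containing only "$$$$"
  ends <- which(trimws(lines) == "$$$$")
  if (length(ends) == 0) stop("no '$$$$' record terminators found in ", path)
  starts <- c(1, head(ends, -1) + 1)
  # Assign consecutive records to batches 1, 2, 3, ...
  batch_id <- ceiling(seq_along(ends) / batch_size)
  out_files <- character(max(batch_id))
  for (b in unique(batch_id)) {
    idx <- which(batch_id == b)
    rng <- starts[idx[1]]:ends[idx[length(idx)]]
    out_files[b] <- file.path(out_dir, sprintf("batch_%02d.sdf", b))
    writeLines(lines[rng], out_files[b])
  }
  out_files
}

# Usage sketch (needs Rcpi; "molecules.sdf" is a placeholder file name):
# batch_files <- split_sdf("molecules.sdf", batch_size = 20)
# descs <- lapply(batch_files, function(f) {
#   extractDrugAIO(readMolFromSDF(f), warn = FALSE)
# })
# data <- do.call(rbind, descs)  # one row per molecule, batches stacked in order
```

If a single session still runs out of memory, each batch file can also be processed in its own fresh R session and the results saved to disk before combining them.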

Best, -Nan