egonw / bacting

Bacting is an open-source platform for chemo- and bioinformatics based on Bioclipse that defines a number of common domain objects and wraps common functionality, providing a toolkit-independent, scriptable solution to handle data from the life sciences.

CDKManager.loadMolecule with Scyjava does not work #58

Closed: kozo2 closed this issue 2 years ago

kozo2 commented 3 years ago

Bacting API method with unexpected output

loadMolecule did not work when importing CDKManager with scyjava.

Expected Output

kegg_sdf should hold a data object for the given input file.

Actual Output

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-14-9f73bc61576b> in <module>()
----> 1 kegg_sdf = cdk.loadMolecule("./kegg.sdf")

TypeError: No matching overloads found for net.bioclipse.managers.CDKManager.loadMolecule(str), options are:
    public net.bioclipse.cdk.domain.ICDKMolecule net.bioclipse.managers.CDKManager.loadMolecule(java.io.InputStream,org.openscience.cdk.io.formats.IChemFormat) throws net.bioclipse.core.business.BioclipseException,java.io.IOException

Additional context

You can reproduce the error with https://colab.research.google.com/drive/1RwXBVgClncDwascTbhhsFkuzPmgFl6ar?usp=sharing

egonw commented 3 years ago

Okay, I think this one is a bit more complicated. It is likely the same str/String issue, but I am not sure that it is the only thing to solve here. I also think both need a ScyJava example somewhere in the documentation.

egonw commented 2 years ago

I ported the method you need, which will be part of the next release:

from scyjava import config, jimport

config.add_endpoints('io.github.egonw.bacting:managers-cdk:0.0.24')

workspaceRoot = "."
cdkClass = jimport("net.bioclipse.managers.CDKManager")
cdk = cdkClass(workspaceRoot)

heptane = cdk.loadMolecule("/Test/heptane.mol")

print(heptane)

egonw commented 2 years ago

I am going to release 0.0.24 now.

egonw commented 2 years ago

done.

kozo2 commented 2 years ago

@egonw Thanks. However, there seems to be a problem with how loadMolecule handles the file path string passed as its argument. You can check it with https://colab.research.google.com/drive/1RwXBVgClncDwascTbhhsFkuzPmgFl6ar?usp=sharing

from scyjava import config, jimport
config.add_endpoints('io.github.egonw.bacting:managers-cdk:0.0.24')

workspaceRoot = "."
cdkClass = jimport("net.bioclipse.managers.CDKManager")
cdk = cdkClass(workspaceRoot)
# kegg_sdf = cdk.loadMolecule("./kegg.sdf")  # this does not work
# kegg_sdf = cdk.loadMolecule("/content/kegg.sdf")  # this also does not work
kegg_sdf = cdk.loadMolecule("/kegg.sdf")  # this works but is not appropriate

egonw commented 2 years ago

Yes, this is a bit of Bioclipse legacy (which it inherits from Eclipse): everything lives in a Project in the workspace. On disk, a Project is just a folder.

So, your workspaceRoot is ".". In that folder, make a subfolder "KEGG" and copy the kegg.sdf file into it. Then:

kegg_sdf = cdk.loadMolecule("/KEGG/kegg.sdf")
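
The folder setup described above can be scripted. The following is only a sketch, assuming that paths given to loadMolecule have the form "/<project>/<file>" relative to workspaceRoot, as in the "/KEGG/kegg.sdf" example. The helper name stage_in_project is hypothetical and not part of Bacting.

```python
import shutil
from pathlib import Path


def stage_in_project(workspace_root, project, source_file):
    """Copy source_file into <workspace_root>/<project>/ and return the
    workspace-relative path "/<project>/<filename>".

    Hypothetical helper, not part of Bacting. It assumes a project is
    simply a subfolder of the workspace root on disk.
    """
    source = Path(source_file)
    project_dir = Path(workspace_root) / project
    # Create the project folder if it does not exist yet.
    project_dir.mkdir(parents=True, exist_ok=True)
    target = project_dir / source.name
    # Skip the copy when the file is already in place.
    if source.resolve() != target.resolve():
        shutil.copy(source, target)
    return f"/{project}/{source.name}"


# Usage with the cdk object created earlier in this thread (not run here):
# bacting_path = stage_in_project(".", "KEGG", "./kegg.sdf")
# kegg_sdf = cdk.loadMolecule(bacting_path)
```

The helper only prepares the file on disk and builds the path string; loading still goes through cdk.loadMolecule as shown above.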

kozo2 commented 2 years ago

Thank you for the information. I understand why you always put files under a certain directory.