rdkit / knime-rdkit

The RDKit nodes for the KNIME Analytics Platform

Knime RDKit nodes "RDKit Canon Smiles" and "RDKit to InChI" are crashing Knime 4.4.4 #107

Closed: ddsneo4j closed this issue 2 years ago

ddsneo4j commented 2 years ago

Discussed in https://github.com/rdkit/rdkit/discussions/5227

Originally posted by **ddsneo4j** April 22, 2022

Hi,

Unfortunately, the RDKit nodes "RDKit Canon Smiles" and "RDKit to InChI" are crashing KNIME 4.4.4; see the attached KNIME workflow and input structure. Could this bug please be fixed? KNIME should not crash just because these nodes receive a structure for which canonical SMILES and InChI keys cannot be created. Thanks in advance for your effort.

Best regards, Dan

[RdKit.zip](https://github.com/rdkit/rdkit/files/8539932/RdKit.zip)

ddsneo4j commented 2 years ago

Thanks, Greg, for transferring this issue to the right place. KNIME support told me that they also approached you about this issue a couple of months ago. Hopefully this can be fixed soon, as the KNIME crashes are blocking my current work.

Thanks, Dan

ddsneo4j commented 2 years ago

Any update on this issue?

Thanks, Dan

greglandrum commented 2 years ago

@manuelschwarze: I can reproduce this in KNIME on Windows, but it looks like the problem is not coming from the underlying C++ code. I'm able to read in the example molecule and generate SMILES from Python without problems. I'm travelling and don't have an Eclipse environment installed. Could you please take a look at this and figure out which call is causing the crash from within the KNIME nodes?

greglandrum commented 2 years ago

@manuelschwarze I did an experiment where I tried loading that molecule using the RDKit Java wrappers in a Java Snippet node (workflow attached) and can reproduce the crash there. So, unfortunately for me, it looks like we can blame the RDKit backend (or at least the Java wrappers) for this. I will see if I can figure it out.

Github107_simplified.zip

If you have time, it would be interesting to know whether or not you can execute the "This is the one that crashes" Java Snippet node in your Windows dev environment.

manuelschwarze commented 2 years ago

@greglandrum I ran it with KNIME 4.3 on my Windows 10 system, and it crashed as expected. I had also reproduced the crash before. Only native code integrated into Java can bring Java down immediately, like RDKit or a GTK issue on Linux. I hope you find the culprit in RDKit or in the Java wrappers (again, it would be something specific to native code in that case). Thanks for taking the time!

manuelschwarze commented 2 years ago

@greglandrum Your Java Snippet node had two calls. Only the second one actually crashes KNIME: `mol.MolToSmiles()`. I tried running it with all combinations of the two boolean parameters, `mol.MolToSmiles(bool, bool)`, and it fails for all of them as well. Maybe that helps as a hint. Not much, though...

greglandrum commented 2 years ago

Yeah, sorry, I had done that test too

greglandrum commented 2 years ago

@ddsneo4j @manuelschwarze after a bunch of poking around and testing, I believe that I have found the problem, which is a JVM problem, and a fix. To "fix" the problem: edit your knime.ini file and add this line somewhere:

-Xss1024m
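
As a hedged illustration of where that line goes: in Eclipse-based launchers such as KNIME, JVM options are read from the part of the `.ini` file that follows `-vmargs`, so the stack-size option is assumed to belong there. The `-Xmx` value below is only a placeholder for whatever your own file already contains, and the other lines of the file are omitted.

```
-vmargs
-Xmx2048m
-Xss1024m
```

KNIME needs to be restarted for a change to `knime.ini` to take effect.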

Here's what's going on: When the RDKit generates SMILES it calls a function which recurses over all of the atoms in the molecule (worst case, if the molecule has either no rings or very few rings). This type of deep recursion requires a lot of stack space and it seems like the default stack size for the JVM on Windows is small enough that the recursion on this molecule exhausts the available stack space. The solution is to increase the stack size for the JVM used in KNIME (which is what the line above does).
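
The effect described above can be sketched in plain Python. This is not the RDKit implementation, only a toy depth-first traversal over a bond list; the helper names `build_graph` and `max_recursion_depth` are made up for this illustration. It shows that for a linear chain the recursion goes as deep as the number of atoms, while a heavily branched molecule of the same size stays shallow.

```python
from collections import defaultdict


def build_graph(bonds):
    """Build an undirected adjacency list from (atom, atom) index pairs."""
    adjacency = defaultdict(list)
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    return adjacency


def max_recursion_depth(adjacency, atom, visited=None, depth=1):
    """Recursive depth-first walk; returns the deepest call level reached."""
    if visited is None:
        visited = set()
    visited.add(atom)
    deepest = depth
    for neighbor in adjacency[atom]:
        if neighbor not in visited:
            deepest = max(
                deepest,
                max_recursion_depth(adjacency, neighbor, visited, depth + 1),
            )
    return deepest


n_atoms = 200  # kept small so Python's own recursion limit is not hit

# Linear chain: 0-1-2-...-199 (no rings, no branches)
chain_bonds = [(i, i + 1) for i in range(n_atoms - 1)]
# Star: atom 0 bonded to every other atom (maximally branched)
star_bonds = [(0, i) for i in range(1, n_atoms)]

chain_depth = max_recursion_depth(build_graph(chain_bonds), 0)
star_depth = max_recursion_depth(build_graph(star_bonds), 0)

print("chain:", chain_depth, "star:", star_depth)
```

Each level of recursion occupies one stack frame, so a long, mostly unbranched molecule is the case most likely to exhaust a small thread stack, which is consistent with the explanation above.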

ddsneo4j commented 2 years ago

Thanks, Greg. I have made the requested addition to the knime.ini file and started my workflow. I'll give you an update about the outcome next week.

Thanks and best regards, Dan

ddsneo4j commented 2 years ago

Hi, I can confirm that the workflow ran to completion, i.e. no crashing anymore. Thanks a lot!

Best, Dan

greglandrum commented 2 years ago

> Hi, I can confirm that the workflow ran to completion, i.e. no crashing anymore. Thanks a lot!

Glad to hear it!