MolecularAI / REINVENT4

AI molecular design tool for de novo design, scaffold hopping, R-group replacement, linker design and molecule optimization.
Apache License 2.0

Fixed DockStream issue: Outputting 0 Docking score values for all mol… #131

Closed: raweru closed this 2 months ago

raweru commented 2 months ago

Description: I was trying to run the DockStream RL workflow from one of the notebooks on my HPC Linux system with the OpenEye Omega and Hybrid programs, but kept getting docking scores of 0.000 for all molecules, which seemed odd. (This was before I read that you no longer support DockStream.)

Anyway, I dove deep into the DockStream-Reinvent source code (nicely written, btw) to find the fault, since my tests showed that DockStream with the OE programs worked fine when executed outside of Reinvent.

There is a tiny issue in how the SMILES values are passed in the subprocess command inside comp_dockstream.py, which Reinvent does not seem to like very much. It works fine after the fix, though!
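To illustrate the kind of problem described here, the following is a minimal sketch, not the actual code from comp_dockstream.py. It assumes the SMILES are joined into one string and passed as a single element of an argument list to `subprocess` without a shell; the `;` separator and the helper name `echo_argument` are illustrative assumptions. Without a shell, nothing strips quote characters, so any quotes added around the string reach the child process literally.

```python
import subprocess
import sys


def echo_argument(arg: str) -> str:
    """Run a child Python process that prints its first argument verbatim.

    Hypothetical helper for illustration only; it stands in for the
    external docking program that receives the SMILES argument.
    """
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", arg],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


smiles = ["CCO", "c1ccccc1"]
joined = ";".join(smiles)  # separator is an assumption for this sketch

# Extra quotes are NOT removed, because no shell is involved:
quoted = echo_argument(f'"{joined}"')

# Passing the plain string delivers exactly the intended value:
unquoted = echo_argument(joined)

print(repr(quoted))
print(repr(unquoted))
```

In this sketch the child sees the quote characters as part of the first and last SMILES, which a downstream parser would likely reject or misread.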

I can also write a workflow notebook for the OE-DockStream combo if it would be useful to anyone. Shame about DockStream, though; I had just started to like it! I will start studying up on maize when I get a chance.

halx commented 2 months ago

There is indeed no reason to quote the string, as we are not using the shell (which is, unfortunately and falsely, often done to spawn external processes), so this really is a bug. Many thanks for fixing this. The caveat on Linux, however, is that a single argument can only hold a rather limited number of characters (128K-1, I believe), so there should be a check for this, and the run should probably be terminated if the limit is exceeded. In practice, with a typical batch size of maybe 100-200, that should not be an issue though.
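The length check suggested above could look roughly like the following sketch. It is not part of REINVENT; the function name `check_argument_length` and the constant are illustrative assumptions. The limit used is the common Linux per-argument cap of 131072 bytes (32 pages of 4 KiB) including the terminating NUL byte, which matches the "128K-1" characters mentioned above; systems with a different page size may differ.

```python
# Assumed per-argument limit on Linux with 4 KiB pages (32 * 4096 bytes),
# counting the terminating NUL byte. Not queried from the system here.
MAX_ARG_STRLEN = 32 * 4096


def check_argument_length(arg: str, limit: int = MAX_ARG_STRLEN) -> int:
    """Return the byte size of `arg` (plus NUL) or raise if it is too long.

    Hypothetical helper: intended to be called on the joined SMILES string
    before it is handed to subprocess as a single argument.
    """
    n_bytes = len(arg.encode("utf-8")) + 1  # +1 for the terminating NUL
    if n_bytes > limit:
        raise ValueError(
            f"argument needs {n_bytes} bytes, limit is {limit}; "
            "reduce the batch size or pass the SMILES another way"
        )
    return n_bytes


# A batch of 200 SMILES of about 60 characters each stays far below the limit.
batch = ";".join(["C" * 60] * 200)
print(check_argument_length(batch))
```

Raising an exception here terminates the run early with a clear message, rather than letting the process spawn fail with a less obvious "argument list too long" error.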

A notebook is always welcome.

raweru commented 2 months ago

Yeah no problem! I will create a new PR for the notebook.

halx commented 2 months ago

Many thanks, but there is no need to open a new PR for new commits. It just confuses matters.