Closed dionjwa closed 6 years ago
Sorry for the poor documentation of dependencies. I would bet that error is appearing because Keras now uses the Tensorflow backend by default (which did not use to be the case) and Tensorflow tensors do not have an ndim property. I haven't worked with Keras+Tensorflow myself, but perhaps replacing x.ndim with one of the expressions mentioned here could work. As a short-term solution, I would recommend trying to switch your Keras backend to Theano - hopefully that resolves it.
To get the shape manipulations to work out with lambda expressions, I kind of ruined the flexibility of Keras to work with both Theano and Tensorflow backends. You'll notice that use of Theano is hardcoded in the architecture.
I had the same problem. I think that if you change the backend from tensorflow to theano, it will work. Editing $HOME/.keras/keras.json worked fine for me. FYI https://keras.io/backend/
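For anyone who wants to script that edit (for example in a Docker build), here is a minimal sketch. The helper name set_keras_backend is made up for illustration and is not part of Keras. It assumes the multi-backend Keras config format, where keras.json holds a "backend" key.

```python
import json
from pathlib import Path


def set_keras_backend(config_path, backend="theano"):
    """Set the "backend" key in a keras.json file and return the new config.

    Hypothetical helper for illustration; it only edits the JSON file and
    does not import Keras. Other keys already in the file are preserved.
    """
    path = Path(config_path).expanduser()
    # Start from the existing config if there is one, otherwise from scratch.
    config = json.loads(path.read_text()) if path.exists() else {}
    config["backend"] = backend
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=4))
    return config


# Typical call (left commented out so this file has no side effects on import):
# set_keras_backend("~/.keras/keras.json")
```

Multi-backend Keras also reads the KERAS_BACKEND environment variable, so setting KERAS_BACKEND=theano for a single run is an alternative that leaves the config file untouched.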
Thanks for the help! That solved that problem. Now I'm getting this:
Total params: 144,001
Trainable params: 144,001
Non-trainable params: 0
__________________________________________________________________________________________________
Enter SMILES of reactants: C#N [Na+].[Cl-]
Could not parse!
Enter SMILES of reactants: C.O=O>O=[O+]-[O-]>
O=C=O.OCould not parse!
Enter SMILES of reactants: C.O=O O=[O+]-[O-]
Could not parse!
Enter SMILES of reactants: C1=CC=CN=C1
Number of reactant atoms: 6
Reactants w/o map: c1ccncc1
Reactants w/ map: [cH:1]1[cH:2][cH:3][cH:4][n:5][cH:6]1
0%| | 0/1689 [00:00<?, ?it/s]Outcome SMILES: C1=CNCCC1
Do you have the custom RDKit version installed? Maybe not...
Traceback (most recent call last):
  File "ochem_predict_nn/scripts/lowe_interactive_predict.py", line 253, in <module>
    candidate_list = reactants_to_candidate_edits(reactants)
  File "ochem_predict_nn/scripts/lowe_interactive_predict.py", line 62, in reactants_to_candidate_edits
    raise(e)
KeyError: 'molAtomMapNumber'
What is the "custom RDKit version"? Does that have to do with the error I finally see?
It does, yes. This code relies on a version of RDKit built from source here: https://github.com/connorcoley/RDKit
The modification that has been made is in the RunReactants code. In the standard release of RDKit, the atom map numbers of reactant atoms belonging to the reaction center (i.e., that match atoms in the templates) are removed in the products. My fork copies over those values into the new product atoms as a new atom property field. The result is that we can reconstruct a full atom-to-atom mapping for the candidate reaction.
If your use-case doesn't let you build RDKit from source, there is a workaround but it would require more extensive modifications to the code here. Instead of using atom map numbers (and the modified RDKit) to recover the atom-to-atom mapping, you can cheat a little by using isotope numbers instead. The standard RDKit release will preserve isotopes as long as the reaction SMARTS do not specify isotope numbers. So instead of taking the reactants input, assigning atom mapping, running the reaction, recovering atom-to-atom mapping through atom map numbers, and then converting to the edit-based representation, what you would do is: assign unique isotope numbers to each reactant atom, run the reaction, copy isotope numbers to atom map numbers for reactants and products, strip isotope numbers from reactants and products, and use the now-recovered atom-to-atom mapping to convert to the edit-based representation.
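The isotope workaround described above can be sketched roughly as follows. This is not code from this repository: the function name run_with_isotope_mapping is made up for illustration, it assumes a recent standard RDKit release (one that provides Atom.SetAtomMapNum), and it only handles templates with a single reactant pattern. It also overwrites any isotope labels already present in the input.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def run_with_isotope_mapping(reactant_smiles, reaction_smarts):
    """Run a one-reactant template and recover atom mapping via isotopes.

    Hypothetical helper for illustration. Returns a tuple
    (mapped_reactant_smiles, [mapped_product_smiles, ...]).
    """
    reactants = Chem.MolFromSmiles(reactant_smiles)
    if reactants is None:
        raise ValueError("Could not parse reactants: %r" % reactant_smiles)

    # 1. Assign a unique isotope number (1-based) to each reactant atom.
    for idx, atom in enumerate(reactants.GetAtoms(), start=1):
        atom.SetIsotope(idx)

    # 2. Run the reaction. The SMARTS must not specify isotopes itself.
    rxn = AllChem.ReactionFromSmarts(reaction_smarts)
    outcomes = rxn.RunReactants((reactants,))

    def isotopes_to_map_numbers(mol):
        # 3. Copy isotope -> atom map number, then 4. strip the isotope.
        # Atoms created by the template have isotope 0 and stay unmapped.
        for atom in mol.GetAtoms():
            atom.SetAtomMapNum(atom.GetIsotope())
            atom.SetIsotope(0)
        return mol

    mapped_products = []
    for outcome in outcomes:
        for product in outcome:
            mapped_products.append(
                Chem.MolToSmiles(isotopes_to_map_numbers(product))
            )

    # Product atoms are copies, so relabelling the reactants last is safe.
    mapped_reactants = Chem.MolToSmiles(isotopes_to_map_numbers(reactants))
    return mapped_reactants, mapped_products
```

For example, calling it with "CC(=O)O" and the toy template "[C:1](=[O:2])[OH]>>[C:1](=[O:2])Cl" should give a product in which the three atoms carried over from the reactant have map numbers and the new Cl has none. The mapped SMILES would then still have to be converted to the edit-based representation by the existing code.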
I made a conda package from the modified version of RDKit and installed it with the conda command. I think it is an easy way to set up a virtual environment without installing RDKit directly into the system environment. I am happy to push the package if it would help anyone.
@connorcoley I'm getting closer.
I have built your fork of RDKit, but I'm getting a different error now:
Using Theano backend.
Traceback (most recent call last):
  File "ochem_predict_nn/scripts/lowe_interactive_predict.py", line 12, in <module>
    import ochem_predict_nn.main.transformer as transformer
  File "/chem/ochem_predict_nn/main/transformer.py", line 3, in <module>
    import rdkit.Chem as Chem
  File "/rdkit/rdkit/Chem/__init__.py", line 29, in <module>
    from rdkit.Chem.inchi import *
ImportError: No module named inchi
The import line is here:
from rdkit.Chem.rdmolfiles import *
from rdkit.Chem.rdmolops import *
from rdkit.Chem.inchi import *
But the generated files look like:
...
rdfiltercatalog.so
rdfragcatalog.so
rdinchi.so
rdmolfiles.so
...
Am I on the right branch of https://github.com/connorcoley/rdkit? I notice there are a lot.
If you want to see what I'm trying to do, you can take a look at my fork:
https://github.com/dionjwa/ochem_predict_nn/tree/dockerize
At the bottom of the README.md I've added instructions on how to get the docker-compose stack working. You can see the install steps in the Dockerfile.
I'm just trying to get a reproducible docker build and run a test.
If I change
from rdkit.Chem.inchi import *
to
from rdkit.Chem.rdinchi import *
I get further, but given that others haven't hit this, I suspect I'm on the wrong path here. So many complex dependencies :-)
Once I make the above change things work!
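A less invasive variant of that change is to try the original module name first and only fall back to the compiled one. The sketch below is a generic helper (the name import_first_available is made up for illustration) and is not taken from RDKit or from this repository.

```python
import importlib


def import_first_available(module_names):
    """Return the first module in module_names that imports successfully.

    Hypothetical helper for illustration. Raises ImportError, listing every
    failure, if none of the candidates can be imported.
    """
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            errors.append("%s: %s" % (name, exc))
    raise ImportError(
        "None of the candidates could be imported: " + "; ".join(errors)
    )


# Intended use for the case above (needs an RDKit build on the path):
# inchi = import_first_available(["rdkit.Chem.inchi", "rdkit.Chem.rdinchi"])
```

Note that this returns a module object instead of doing a star import, so callers would go through the returned module. The two modules are also not guaranteed to expose identical functions or return types, so code that actually uses the InChI functions may still need checking.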
@dionjwa, that import error is very strange, since that's all within RDKit. When I forked it, I only changed a couple of files (pertaining to running reactions, certainly no module renaming or changing __init__.py files). I'm glad you've gotten it to work! I appreciate you keeping your fork open source so others can benefit from it. One of these days, I should learn how to use Docker...
@dionjwa Any pointers on getting this docker image to build (like the legacy continuumio/anaconda version)? I've tried to get it running on an EC2 instance, but it keeps failing.
Hi!
I'm attempting to use this work as part of another project to make predictions:
https://github.com/dionjwa/ochem_predict_nn/tree/dockerize
Initially I'd like to just get this working in Docker, so at least we have a reproducible build that works. There are no instructions on e.g. versions of dependencies, so I had to make a bunch of guesses. I've got it mostly working, but it still gives various errors: