AttributeError: 'Tensor' object has no attribute 'ndim'

dionjwa commented 6 years ago

Hi!

I'm attempting to use this work as part of another project to make predictions:

https://github.com/dionjwa/ochem_predict_nn/tree/dockerize

Initially I'd like to just get this running in Docker, so at least we have a reproducible build. There are no instructions on, e.g., versions of dependencies, so I had to make a bunch of guesses. I've got it mostly working, but it still gives various errors:


Starting ochempredictnn_db_1 ... done
/opt/conda/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
Out of 140284 database templates,
Loaded 1689 templates
Sorted by count, descending
Arguments loaded from saved model:
{u'Nhf': 50, u'optimizer': u'adadelta', u'baseline': 0, u'inner_act': u'tanh', u'hybrid': 0, u'nb_epoch': 200, u'Nc': 1000, u'masktest': 0, u'batch_size': 1, u'retrain': False, u'fold': 1, u'l2': 0.0, u'tag': u'10xrn_demo', u'lr': 0.01, u'Nh1': 200, u'Nh3': 50, u'Nh2': 100, u'test': False, u'data_tag': u'ochem_predict_nn/data/lowe_data_edits/lowe'}
/opt/conda/lib/python2.7/site-packages/keras/engine/topology.py:1269: UserWarning: The `Merge` layer is deprecated and will be removed after 08/2017. Use instead layers from `keras.layers.merge`, e.g. `add`, `concatenate`, etc.
  return cls(**config)
Traceback (most recent call last):
  File "ochem_predict_nn/scripts/lowe_interactive_predict.py", line 228, in <module>
    BASELINE_MODEL = BASELINE_MODEL,
  File "/chem/ochem_predict_nn/main/score_candidates_from_edits_compact.py", line 69, in build
    h_lost_r    = Lambda(dynamic_reshaper, output_shape = dynamic_reshaper_shape, name = "flatten_H_lost")(h_lost)
  File "/opt/conda/lib/python2.7/site-packages/keras/engine/topology.py", line 617, in __call__
    output = self.call(inputs, **kwargs)
  File "/opt/conda/lib/python2.7/site-packages/keras/layers/core.py", line 663, in call
    return self.function(inputs, **arguments)
  File "/chem/ochem_predict_nn/main/score_candidates_from_edits_compact.py", line 66, in <lambda>
    dynamic_reshaper       = lambda x: T.reshape(x, (x.shape[0] * x.shape[1] * x.shape[2], x.shape[3]), ndim  = x.ndim-2)
AttributeError: 'Tensor' object has no attribute 'ndim'

Are these errors due to a problem in the dependencies, a problem with the arguments, or something else?

Thanks!

connorcoley commented 6 years ago

Sorry for the poor documentation of dependencies. I would bet that error is appearing because Keras now uses the TensorFlow backend by default (which did not use to be the case), and TensorFlow tensors do not have an ndim attribute. I haven't worked with Keras+TensorFlow myself, but perhaps replacing x.ndim with one of the expressions mentioned here could work. As a short-term solution, I would recommend switching your Keras backend to Theano; hopefully that resolves it.
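A minimal sketch of the "replace x.ndim" idea, not code from this repository: the rank of a tensor can be looked up without relying on the ndim attribute. The helper name tensor_rank and the stand-in class are hypothetical. The fallback assumes the tensor's shape supports len(), which should hold for TensorFlow shapes of known rank; keras.backend.ndim(x) is another option worth trying.

```python
def tensor_rank(x):
    """Return the number of dimensions of a tensor-like object.

    Hypothetical helper: uses x.ndim when it exists (Theano, NumPy) and
    otherwise falls back to len(x.shape).
    """
    rank = getattr(x, "ndim", None)
    if rank is not None:
        return rank
    return len(x.shape)


class _ShapeOnly(object):
    """Stand-in for a tensor that exposes .shape but not .ndim."""

    def __init__(self, shape):
        self.shape = shape


# A 4-D stand-in; the lambda in the traceback asks for rank - 2.
example = _ShapeOnly((2, 3, 5, 7))
flattened_rank = tensor_rank(example) - 2
```

Note that the lambda in the traceback also passes an ndim keyword to Theano's T.reshape, so swapping out x.ndim alone would probably not make the model backend-independent.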

To get the shape manipulations to work out with lambda expressions, I kind of ruined Keras's flexibility to work with both the Theano and TensorFlow backends. You'll notice that the use of Theano is hardcoded in the architecture.

iwatobipen commented 6 years ago

I had the same problem. I think that if you change the backend from TensorFlow to Theano, it will work well. Editing $HOME/.keras/keras.json worked fine for me. FYI https://keras.io/backend/
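For a Docker build, the same edit can be scripted. The following is a sketch under assumptions: the function name set_keras_backend is made up for illustration, and it only rewrites the "backend" key of the JSON file while leaving any other keys alone. The demonstration writes to a scratch directory rather than the real config.

```python
import json
import os
import tempfile


def set_keras_backend(config_path, backend="theano"):
    """Set the "backend" key in a Keras JSON config, keeping other keys."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as handle:
            config = json.load(handle)
    config["backend"] = backend
    config_dir = os.path.dirname(config_path)
    if config_dir and not os.path.isdir(config_dir):
        os.makedirs(config_dir)
    with open(config_path, "w") as handle:
        json.dump(config, handle, indent=4)
    return config


# Demonstration on a scratch file so nothing real is overwritten.
demo_path = os.path.join(tempfile.mkdtemp(), ".keras", "keras.json")
demo_config = set_keras_backend(demo_path)

# To change the real config mentioned above, one would call:
# set_keras_backend(os.path.expanduser("~/.keras/keras.json"))
```

Keras also reads the KERAS_BACKEND environment variable, so setting KERAS_BACKEND=theano in the container may be a simpler alternative to editing the file.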

dionjwa commented 6 years ago

Thanks for the help! That solved that problem. Now I'm getting this:

Total params: 144,001
Trainable params: 144,001
Non-trainable params: 0
__________________________________________________________________________________________________
Enter SMILES of reactants: C#N [Na+].[Cl-]
Could not parse!
Enter SMILES of reactants: C.O=O>O=[O+]-[O-]>
O=C=O.OCould not parse!
Enter SMILES of reactants: C.O=O O=[O+]-[O-]
Could not parse!
Enter SMILES of reactants: C1=CC=CN=C1
Number of reactant atoms: 6
Reactants w/o map: c1ccncc1
Reactants w/ map: [cH:1]1[cH:2][cH:3][cH:4][n:5][cH:6]1
  0%|                                                                                                                                                                                                                                                  | 0/1689 [00:00<?, ?it/s]Outcome SMILES: C1=CNCCC1
Do you have the custom RDKit version installed? Maybe not...

Traceback (most recent call last):
  File "ochem_predict_nn/scripts/lowe_interactive_predict.py", line 253, in <module>
    candidate_list = reactants_to_candidate_edits(reactants)
  File "ochem_predict_nn/scripts/lowe_interactive_predict.py", line 62, in reactants_to_candidate_edits
    raise(e)
KeyError: 'molAtomMapNumber'

What is the "custom RDKit version"? Does that have to do with the error I finally see?

connorcoley commented 6 years ago

It does, yes. This code relies on a version of RDKit built from source, available here: https://github.com/connorcoley/RDKit

The modification that has been made is in the RunReactants code. In the standard release of RDKit, the atom map numbers of reactant atoms belonging to the reaction center (i.e., that match atoms in the templates) are removed in the products. My fork copies over those values into the new product atoms as a new atom property field. The result is that we can reconstruct a full atom-to-atom mapping for the candidate reaction.

If your use-case doesn't let you build RDKit from source, there is a workaround but it would require more extensive modifications to the code here. Instead of using atom map numbers (and the modified RDKit) to recover the atom-to-atom mapping, you can cheat a little by using isotope numbers instead. The standard RDKit release will preserve isotopes as long as the reaction SMARTS do not specify isotope numbers. So instead of taking the reactants input, assigning atom mapping, running the reaction, recovering atom-to-atom mapping through atom map numbers, and then converting to the edit-based representation, what you would do is: assign unique isotope numbers to each reactant atom, run the reaction, copy isotope numbers to atom map numbers for reactants and products, strip isotope numbers from reactants and products, and use the now-recovered atom-to-atom mapping to convert to the edit-based representation.

iwatobipen commented 6 years ago

I made a conda package from the modified version of RDKit and installed it with the conda command. I think it is an easy way to build a virtual environment without installing RDKit directly into the system environment. I am happy to push the package if it would be of help to someone.

dionjwa commented 6 years ago

@connorcoley I'm getting closer.

I have built your fork of RDKit, but I'm getting a different error now:

Using Theano backend.
Traceback (most recent call last):
  File "ochem_predict_nn/scripts/lowe_interactive_predict.py", line 12, in <module>
    import ochem_predict_nn.main.transformer as transformer
  File "/chem/ochem_predict_nn/main/transformer.py", line 3, in <module>
    import rdkit.Chem as Chem
  File "/rdkit/rdkit/Chem/__init__.py", line 29, in <module>
    from rdkit.Chem.inchi import *
ImportError: No module named inchi

The import line is here:

from rdkit.Chem.rdmolfiles import *
from rdkit.Chem.rdmolops import *
from rdkit.Chem.inchi import *

But the generated files look like:

...
rdfiltercatalog.so
rdfragcatalog.so
rdinchi.so
rdmolfiles.so
...

Am I on the right branch of https://github.com/connorcoley/rdkit? I notice there are a lot.

If you want to see what I'm trying to do, you can take a look at my fork:

https://github.com/dionjwa/ochem_predict_nn/tree/dockerize

At the bottom of the README.md I've added instructions on how to get the docker-compose stack working. You can see the install steps in the Dockerfile.

I'm just trying to get a reproducible docker build and run a test.

dionjwa commented 6 years ago

If I change

from rdkit.Chem.inchi import *

to

from rdkit.Chem.rdinchi import *

I get further, but given that others haven't hit this, I suspect I'm on the wrong path here. So many complex dependencies :-)

dionjwa commented 6 years ago

Once I make the above change things work!
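Rather than hard-coding one of the two module names, a fallback import could be sketched as follows. This is an assumption-laden illustration: the helper import_first_available is hypothetical, and unlike the original line it returns the module instead of doing a star import. The demonstration uses standard-library names so it runs without RDKit.

```python
import importlib


def import_first_available(module_names):
    """Import and return the first module in module_names that exists."""
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            errors.append("%s: %s" % (name, exc))
    raise ImportError("none importable; " + "; ".join(errors))


# With RDKit installed, the call would look like:
# inchi = import_first_available(["rdkit.Chem.inchi", "rdkit.Chem.rdinchi"])

# Demonstration with standard-library names so the sketch runs anywhere.
demo_module = import_first_available(["no_such_module_for_demo", "json"])
```

This only papers over the symptom; it does not explain why rdkit.Chem.inchi is missing from the build.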

connorcoley commented 6 years ago

@dionjwa, that import error is very strange, since that's all within RDKit. When I forked it, I only changed a couple of files (pertaining to running reactions; certainly no module renaming or changes to __init__.py files). I'm glad you've gotten it to work! I appreciate you keeping your fork open source so others can benefit from it. One of these days, I should learn how to use Docker...

Benjaki2 commented 10 months ago

@dionjwa Any pointers on getting this Docker image to build (like the legacy continuumio/anaconda version)? I've tried to get it running on an EC2 instance, but it keeps failing.

connorcoley / ochem_predict_nn

AttributeError: 'Tensor' object has no attribute 'ndim' #3