jparkhill / TensorMol

Tensorflow + Molecules = TensorMol
http://blogs.nd.edu/parkhillgroup
GNU General Public License v3.0
271 stars, 75 forks

Batchwise Evaluation of Chemspider Network #24

Closed Dom1L closed 6 years ago

Dom1L commented 6 years ago

Hello John, John and Kun,

I was trying out some more TensorMol and wanted to get the energy/forces of a molecule with multiple conformations using your ChemSpider Network. So instead of evaluating every conformation separately, I wanted to do a whole stack of conformations at once to be a bit faster.

I tried appending each conformation to a set and then calling EvalBPDirectEESet() on it. The number of molecules inside the MSet() is correct (I checked with a.mols), but the call keeps crashing, with a different error each time. I used the GetChemSpiderNetwork() function from one of your examples and pulled the newest version from the repo.
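To make the intended batch layout concrete, here is a minimal NumPy-only sketch of stacking several conformations of one molecule into arrays with a leading conformer axis. It does not use TensorMol; the helper name `stack_conformers` and the random coordinates are hypothetical, and the 5-atom, 5-conformer sizes are chosen only to mirror the log below.

```python
import numpy as np

def stack_conformers(atomic_numbers, conformers):
    """Stack conformers of a single molecule into batch arrays.

    Returns (Zs, xyzs) with shapes (n_conf, n_atoms) and (n_conf, n_atoms, 3).
    Hypothetical helper for illustration; not part of TensorMol.
    """
    zs = np.asarray(atomic_numbers, dtype=np.int64)
    xyzs = np.asarray(conformers, dtype=np.float64)
    # Every conformer must have one (x, y, z) row per atom.
    if xyzs.ndim != 3 or xyzs.shape[1:] != (zs.shape[0], 3):
        raise ValueError("expected conformers of shape (n_conf, %d, 3), got %r"
                         % (zs.shape[0], xyzs.shape))
    # The atomic numbers are identical for every conformer, so repeat them.
    return np.tile(zs, (xyzs.shape[0], 1)), xyzs

# Assumed example data: 5 atoms (C and H only), 5 conformers.
atmnums = [6, 1, 1, 1, 1]
coords = np.random.RandomState(0).rand(5, 5, 3)

Zs, xyzs = stack_conformers(atmnums, coords)
print(Zs.shape, xyzs.shape)
```

Checking the shapes up front like this is cheap and catches a malformed conformer array before it reaches the network.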

Did I use the wrong function for evaluation, or is there a better way to solve this problem?

Thanks for the help, Dominik

In [9]: a = MSet()

In [10]: for i in tqdm(coords):
    ...:     tmp = MSet()
    ...:     tmp.mols.append(Mol(atmnums, i))
    ...:     a.AppendSet(tmp)
    ...:     
In [15]: manager = GetChemSpiderNetwork(a, False)
loading the set...
finished loading the set..
TensorMolData.type: mol
TensorMolData.dig.name: ANI1_Sym_Direct
NMols in TensorMolData.set: 5
self.MaxNAtoms: 5
TensorMolData_BP.eles [1, 6]
self.HasGrad: True
Unpickling TFManager...
('Finding ', 'TensorMol.Containers.TensorMolData', 'TensorMolData_BP_Direct_EE_WithEle')
('Finding ', 'numpy.core.multiarray', '_reconstruct')
('Finding ', 'numpy', 'ndarray')
('Finding ', 'numpy', 'dtype')
('Finding ', 'numpy.core.multiarray', 'scalar')
('Finding ', 'TensorMol.Containers.DigestMol', 'MolDigester')
('Finding ', 'TensorMol.TFNetworks.TFMolInstanceDirect', 'MolInstance_DirectBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout')
('Finding ', 'tensorflow.python.framework.dtypes', 'DType')
('Finding ', 'TensorMol.ForceModifiers.Transformer', 'Transformer')
TFManager Metadata Loaded, Reviving Networks.
-- TensorMol, Tensorflow Manager Status--
Unpickling TFInstance...
('Finding ', 'numpy.core.multiarray', '_reconstruct')
('Finding ', 'numpy', 'ndarray')
('Finding ', 'numpy', 'dtype')
('Finding ', 'numpy.core.multiarray', 'scalar')
('Finding ', 'tensorflow.python.framework.dtypes', 'DType')
('Finding ', 'TensorMol.ForceModifiers.Transformer', 'Transformer')
('Finding ', 'TensorMol.Containers.TensorMolData', 'TensorMolData_BP_Direct_EE_WithEle')
('Finding ', 'TensorMol.Containers.DigestMol', 'MolDigester')
self.chk_file: ./networks/Mol_chemspider12_clean_maxatom35_ANI1_Sym_Direct_RawBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout_act_sigmoid100/Mol_chemspider12_clean_maxatom35_ANI1_Sym_Direct_RawBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout_act_sigmoid100-chk-20
raised network: ./networks/Mol_chemspider12_clean_maxatom35_ANI1_Sym_Direct_RawBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout_act_sigmoid100
self.ScratchState None
self.ScratchPointer 0
MolInstance.inshape None MolInstance.outshape None
self.activation_function_type:  sigmoid_with_param
self.hidden1: 512  self.hidden2: 512  self.hidden3: 512
self.inshape: 768
self.elu_shift:  0.0030957710815
self.elu_alpha:  -0.00237590850807

In [19]: manager.EvalBPDirectEESet(a, PARAMS["AN1_r_Rc"], PARAMS["AN1_a_Rc"], PARAMS["EECutoffOff"])

self.batch_size: 5   self.MaxNAtoms: 5
loading the session..
INFO:tensorflow:Restoring parameters from ./networks/Mol_chemspider12_clean_maxatom35_ANI1_Sym_Direct_RawBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout_act_sigmoid100/Mol_chemspider12_clean_maxatom35_ANI1_Sym_Direct_RawBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout_act_sigmoid100-chk-20
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-19-3fbc9b8d0d04> in <module>()
----> 1 manager.EvalBPDirectEESet(a, PARAMS["AN1_r_Rc"], PARAMS["AN1_a_Rc"], PARAMS["EECutoffOff"])

/shared/dominik/libraries/TensorMol/TensorMol/TFNetworks/TFMolManage.py in EvalBPDirectEESet(self, mol_set, Rr_cut, Ra_cut, Ree_cut)
   1250                 NLEE = NeighborListSet(xyzs, natom, False, False,  None)
   1251                 rad_eep = NLEE.buildPairs(Ree_cut)
-> 1252                 Etotal, Ebp, Ecc, mol_dipole, atom_charge, gradient  = self.Instances.evaluate([xyzs, Zs, dummy_energy, dummy_dipole, dummy_grads, rad_p, ang_t, rad_eep, 1.0/natom])
   1253                 return Etotal, Ebp, Ecc, mol_dipole, atom_charge, -JOULEPERHARTREE*gradient[0]
   1254 

/shared/dominik/libraries/TensorMol/TensorMol/TFNetworks/TFMolInstanceDirect.py in evaluate(self, batch_data)
   5705                         self.EvalPrepare()
   5706                 feed_dict=self.fill_feed_dict(batch_data+[PARAMS["AddEcc"]]+[np.ones(self.nlayer+1)])
-> 5707                 Etotal, Ebp, Ebp_atom, Ecc, Evdw, mol_dipole, atom_charge, gradient = self.sess.run([self.Etotal, self.Ebp, self.Ebp_atom, self.Ecc, self.Evdw, self.dipole, self.charge, self.gradient], feed_dict=feed_dict)
   5708                 #Etotal, Ebp, Ebp_atom, Ecc, Evdw, mol_dipole, atom_charge, gradient, bp_gradient, syms= self.sess.run([self.Etotal, self.Ebp, self.Ebp_atom, self.Ecc, self.Evdw, self.dipole, self.charge, self.gradient, self.bp_gradient, self.Scatter_Sym], feed_dict=feed_dict)
   5709                 #print ("Etotal:", Etotal, " bp_gradient", bp_gradient)

/shared/dominik/miniconda3/envs/tensormol/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in run(self, fetches, feed_dict, options, run_metadata)
    887     try:
    888       result = self._run(None, fetches, feed_dict, options_ptr,
--> 889                          run_metadata_ptr)
    890       if run_metadata:
    891         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/shared/dominik/miniconda3/envs/tensormol/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1094                 'Cannot feed value of shape %r for Tensor %r, '
   1095                 'which has shape %r'
-> 1096                 % (np_val.shape, subfeed_t.name, str(subfeed_t.get_shape())))
   1097           if not self.graph.is_feedable(subfeed_t):
   1098             raise ValueError('Tensor %s may not be fed.' % subfeed_t)

ValueError: Cannot feed value of shape (100, 3) for Tensor u'Placeholder:0', which has shape '(?, 4)'
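
For reference, the final error says that an array with 3 columns was fed to a placeholder declared with 4 columns; the `?` is how TensorFlow prints a dimension of unknown size. The sketch below reproduces that compatibility rule in plain Python. The function `is_feedable_shape` is a hypothetical stand-in written for illustration, not a TensorFlow or TensorMol call, and the traceback does not say which network input `Placeholder:0` is.

```python
import numpy as np

def is_feedable_shape(value_shape, placeholder_shape):
    """Return True if value_shape is compatible with placeholder_shape.

    None in placeholder_shape is a wildcard (printed as '?' by TensorFlow).
    Hypothetical helper that mimics the check behind the ValueError.
    """
    if len(value_shape) != len(placeholder_shape):
        return False
    return all(p is None or p == v
               for v, p in zip(value_shape, placeholder_shape))

fed = np.zeros((100, 3))  # same shape as the value in the error message

print(is_feedable_shape(fed.shape, (None, 4)))  # False: 3 columns vs 4
print(is_feedable_shape(fed.shape, (None, 3)))  # True: row count is free
```

The row count (100) is not the problem, since the first dimension is unconstrained; only the column count disagrees with what the restored network expects.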

kunkinger commented 6 years ago

Hi Dominik,

I implemented the function for evaluating a set and added a sample in test_tensormol01.py. Let me know if you have any further questions.

-Best, Kun

On Thu, Mar 1, 2018 at 9:36 AM, Dominik Lemm notifications@github.com wrote:
