jparkhill / TensorMol

Tensorflow + Molecules = TensorMol
http://blogs.nd.edu/parkhillgroup
GNU General Public License v3.0

Inconsistent use of tabs and spaces in indentation #9

Closed xiexr151e closed 6 years ago

xiexr151e commented 6 years ago

When I attempt to import TensorMol in the Python interpreter, it complains about inconsistent use of tabs and spaces. The error occurs at line 282 of QuasiNewtonTools.py, but I have seen inconsistent tabs and spaces elsewhere as well (for example, in test_h2o.py).
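One way to find every affected file before importing is to scan the sources for indentation that mixes tabs and spaces. The sketch below is a rough, standard-library-only heuristic; the function names `indentation_styles` and `is_inconsistent` are invented for illustration and are not part of TensorMol. It is stricter than the interpreter's own check, so it may also flag files that Python 2.7 accepts.

```python
def indentation_styles(source):
    """Group 1-based line numbers by the kind of leading whitespace they use."""
    styles = {"tabs": [], "spaces": [], "mixed": []}
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.lstrip(" \t")
        if not stripped:
            continue  # blank or whitespace-only line: nothing to classify
        indent = line[: len(line) - len(stripped)]
        if not indent:
            continue  # unindented line
        if " " in indent and "\t" in indent:
            styles["mixed"].append(number)
        elif "\t" in indent:
            styles["tabs"].append(number)
        else:
            styles["spaces"].append(number)
    return styles


def is_inconsistent(source):
    """True if one line mixes tabs and spaces, or if different lines use different styles."""
    styles = indentation_styles(source)
    return bool(styles["mixed"]) or (bool(styles["tabs"]) and bool(styles["spaces"]))


if __name__ == "__main__":
    sample = "if x:\n\ty = 1\n        z = 2\n"
    print(indentation_styles(sample))
    print(is_inconsistent(sample))
```

Python also ships a `tabnanny` module that performs a related check and can be run as `python -m tabnanny <path>`.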

jparkhill commented 6 years ago

Thanks for bringing this to my attention. I fixed the instances in those two files. All the developers are still using Python 2.7.x, so you'll have better luck with that.

xiexr151e commented 6 years ago

I have a question regarding your tutorial document and the pre-trained neural networks provided in the README. I noticed that the program wants a .tfm file, while the downloaded file was a .tfn file, and that the names in the tutorial and the neural network file do not match. Is the intended file in the tutorial "Mol_H2O_wb97xd_1to21_with_prontonated_ANI1_Sym_Direct_RawBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout_act_sigmoid100_rightalpha_nodropout"?

xiexr151e commented 6 years ago

Hello again. Since I am not sure how to ask questions on the wiki page, I would like to ask them here. I am interested in finding the energy of a molecule, but it looks like EnAndForce() in the test script needs the molecule's coordinates. I am not sure how to get the coordinates to feed into the function. Is it related to a.ReadXYZ()?

kunkinger commented 6 years ago

Hello Derek,

Unzipping networks.tar.gz should produce a folder named "networks" that contains the following files and folders:

chemspider12_nosolvation.tfm

Mol_chemspider12_maxatom35_H2O_with_CH4_ANI1_Sym_Direct_RawBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout_act_sigmoid100_rightalpha.tfn

Mol_chemspider12_maxatom35_H2O_with_CH4_ANI1_Sym_Direct_RawBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout_act_sigmoid100_rightalpha

chemspider12_solvation.tfm

Mol_chemspider12_clean_maxatom35_ANI1_Sym_Direct_RawBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout_act_sigmoid100.tfn

Mol_chemspider12_clean_maxatom35_ANI1_Sym_Direct_RawBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout_act_sigmoid100

water_network.tfm

Mol_H2O_wb97xd_1to21_with_prontonated_ANI1_Sym_Direct_RawBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout_act_sigmoid100_rightalpha_nodropout.tfn

Mol_H2O_wb97xd_1to21_with_prontonated_ANI1_Sym_Direct_RawBP_EE_ChargeEncode_Update_vdw_DSF_elu_Normalize_Dropout_act_sigmoid100_rightalpha_nodropout

We have renamed the .tfm files to make them easier to understand. You just need to move these files and folders into the "networks" folder in your TensorMol folder; then you should be able to run test_tensormol01.py.

I do not think any change was made to MolEmb. You should have the MolEmb.so file in your TensorMol folder after you execute "sudo pip install -e ." in the TensorMol home folder. If you have any questions, I will be happy to answer them.

-Best,

Kun

On Wed, Nov 22, 2017 at 6:41 PM, Derek X. R. Tse notifications@github.com wrote:

Hello again, The build from yesterday allowed me to execute this package, but if I reinstall it, it states that there is no module named MolEmb. Were there any changes made...?

xiexr151e commented 6 years ago

Are there any specific requirements for the input files? I am trying to test the module with my own input (it's just an xyz file with 10 identical water molecules), but I get this error:

ValueError: zero-size array to reduction operation maximum which has no identity

This comes from TensorMolData.py, line 1233, in __init__:

self.MaxNAtoms = np.max([m.NAtoms() for m in self.set.mols])

which calls a NumPy function.

kunkinger commented 6 years ago

The second line of your xyz file needs to start with the string "Comment:". For example:

3
Comment: single water
O 0 0 0
H 0 0 -1
H 0 0 1
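For versions of the package that still expect this prefix, an existing xyz file can be patched before it is read. The following is a minimal sketch that assumes the conventional xyz layout (an atom-count line, a comment line, then one line per atom, possibly repeated for several frames). The helper name `ensure_comment_prefix` is hypothetical and not part of TensorMol.

```python
def ensure_comment_prefix(xyz_text, prefix="Comment:"):
    """Return xyz text in which every frame's second line starts with `prefix`."""
    lines = xyz_text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # skip blank separators between frames
            continue
        natoms = int(lines[i].split()[0])
        if i + 2 + natoms > len(lines):
            raise ValueError("frame starting at line %d is truncated" % (i + 1))
        comment = lines[i + 1]
        if not comment.startswith(prefix):
            comment = (prefix + " " + comment.strip()).rstrip()
        out.append(lines[i].strip())
        out.append(comment)
        out.extend(lines[i + 2 : i + 2 + natoms])  # atom lines are copied unchanged
        i += 2 + natoms
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    water = "3\nsingle water\nO 0 0 0\nH 0 0 -1\nH 0 0 1\n"
    print(ensure_comment_prefix(water))
```

Running the helper on text that already has the prefix leaves it unchanged, so it is safe to apply more than once.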

jparkhill commented 6 years ago

Actually, Kun and Derek, I just eliminated this limitation from the package. Ordinary xyz files should work now, as long as all atom names are characters.

Best- John

xiexr151e commented 6 years ago

Hello again,

May I ask what units you use for the energy output (from EnAndForce)? For a water monomer, we get -.0023 from TensorMol.

kunkinger commented 6 years ago

As now mentioned in the tutorial (http://tensormol.readthedocs.io/en/latest/Tutorials.html#units), the energy is given in Hartrees. The zero of energy is the average atomization energy of the set used to train that network.
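To compare such a value with numbers reported in other units, a small conversion helper is enough. This is only a sketch: the function name `convert_hartree` is made up for illustration, and the conversion factors are the standard CODATA values for one Hartree. The result is still relative to the network's own zero of energy described above, not an absolute energy.

```python
# Standard conversion factors for one Hartree (CODATA 2018 values).
HARTREE_TO_EV = 27.211386245988
HARTREE_TO_KCAL_PER_MOL = 627.5094740631
HARTREE_TO_KJ_PER_MOL = 2625.4996394799


def convert_hartree(energy_hartree):
    """Express an energy given in Hartrees in a few other common units."""
    return {
        "Hartree": energy_hartree,
        "eV": energy_hartree * HARTREE_TO_EV,
        "kcal/mol": energy_hartree * HARTREE_TO_KCAL_PER_MOL,
        "kJ/mol": energy_hartree * HARTREE_TO_KJ_PER_MOL,
    }


if __name__ == "__main__":
    # The water-monomer value quoted earlier in this thread.
    for unit, value in convert_hartree(-0.0023).items():
        print("%-9s %.6f" % (unit, value))
```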

xiexr151e commented 6 years ago

Hello,

Is there a way to directly feed XYZ input into energy calculation functions, rather than using ReadXYZ()?

jordangarside commented 6 years ago

@xiexr151e

molecule = Mol()
molecule.atoms = atoms # an array of nuclear charges
molecule.coords = atomic_coords # an [atomCount x 3] array of positions

jparkhill commented 6 years ago

If En(m) returns the energy given a molecule, and the atoms are "atoms":

EnergyFunctionOfCoords = lambda X: En(Mol(atoms, X))
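The pattern here is a closure: the atom list is fixed once, and the resulting function takes only coordinates. The sketch below shows the idea with toy stand-ins so that it runs without TensorMol installed. `ToyMol` and `toy_energy` are hypothetical placeholders for the real `Mol` class and energy routine, and the "energy" is just a sum of squared coordinates.

```python
class ToyMol(object):
    """Placeholder for a molecule object: stores atomic numbers and coordinates."""

    def __init__(self, atoms, coords):
        self.atoms = atoms
        self.coords = coords


def toy_energy(mol):
    """Placeholder energy: sum of squared coordinates (not a physical model)."""
    return sum(x * x for row in mol.coords for x in row)


def make_energy_function(atoms, energy_of_mol):
    """Fix the atoms and return a function of the coordinates only."""
    return lambda coords: energy_of_mol(ToyMol(atoms, coords))


if __name__ == "__main__":
    energy_of_coords = make_energy_function([8, 1, 1], toy_energy)
    print(energy_of_coords([[0, 0, 0], [0, 0, 1], [0, 0, -1]]))
```

A function of coordinates alone is the shape that optimizers and numerical-gradient routines usually expect, which is presumably why the one-liner above is written that way.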

xiexr151e commented 6 years ago

@jordangarside Could you please elaborate? So, to construct a water molecule, it would look like:

molecule = Mol()
molecule.atoms = [0,0,0]
molecule.coords = [O 0 0 0, H 0 0 1, H 0 0 -1]

Correct?

jordangarside commented 6 years ago

@xiexr151e

molecule = Mol()
molecule.atoms  = np.array([8, 1, 1])
molecule.coords = np.array([
    [0, 0, 0],
    [0, 0, 1],
    [0, 0, -1]
])

jparkhill commented 6 years ago

Mol(np.array([8, 1, 1]), np.array([[0., 0., 0.], [0., -1., -1.], [0., -1., 1.]]))

jeherr commented 6 years ago

These need to be passed in as numpy arrays as @jparkhill mentions here. Python lists won't work.

xiexr151e commented 6 years ago

So, no need to pass in the symbols?

jeherr commented 6 years ago

No, the molecule.atoms variable goes by atomic number. So molecule.atoms = np.array([1, 1, 8]) would work for a water molecule. molecule.coords is just the xyz (no atomic number or symbol) in the same order as molecule.atoms.
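If the starting point is xyz-style text with element symbols, the symbols have to be mapped to atomic numbers first. Below is a minimal sketch in pure Python; the helper name `parse_xyz_block` and the deliberately tiny `ATOMIC_NUMBERS` table are assumptions made for illustration, not TensorMol code.

```python
# Small lookup table for illustration; extend it for other elements.
ATOMIC_NUMBERS = {"H": 1, "C": 6, "N": 7, "O": 8}


def parse_xyz_block(xyz_text):
    """Split a single-frame xyz block into atomic numbers and coordinates."""
    lines = xyz_text.strip().splitlines()
    natoms = int(lines[0].split()[0])
    atom_lines = lines[2 : 2 + natoms]  # skip the count line and the comment line
    if len(atom_lines) != natoms:
        raise ValueError("expected %d atom lines, found %d" % (natoms, len(atom_lines)))
    atoms, coords = [], []
    for line in atom_lines:
        symbol, x, y, z = line.split()[:4]
        atoms.append(ATOMIC_NUMBERS[symbol])
        coords.append([float(x), float(y), float(z)])
    return atoms, coords


if __name__ == "__main__":
    water = "3\nComment: single water\nO 0 0 0\nH 0 0 -1\nH 0 0 1\n"
    atoms, coords = parse_xyz_block(water)
    print(atoms)
    print(coords)
```

The helper returns plain Python lists. Since the replies above say lists will not work, wrap both results with np.array(...) before assigning them to molecule.atoms and molecule.coords.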

xiexr151e commented 6 years ago

Regarding the water network, is it possible to run calculations outside of the TensorMol directory without moving the huge water network dataset to wherever I run the script? I am writing some scripts of my own and would rather not copy the network file into different directories.