MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.33k stars 653 forks source link

Regarding the code to calculate hydrogen bond auto-correlation function and related to issue of #1563 #1844

Closed cy16f01 closed 6 years ago

cy16f01 commented 6 years ago

Sir, @orbeckst, @richardjgowers, @dotsdl, @arose I ran the code for calculating Hydrogen bond Autocorrelation function for water (my system is Glycine-Water) and the code is:

import MDAnalysis as mda
from MDAnalysis.analysis import hbonds
import matplotlib.pyplot as plt
u = mda.Universe('nptmd.tpr', 'nptmd.xtc')
H = u.select_atoms('name HW1 HW2')
O = u.select_atoms('name OW')
N = u.select_atoms('name HW1 HW2')
hb_ac = hbonds.HydrogenBondAutoCorrel(u, acceptors=O, hydrogens=H, donors=H, bond_type = 'continuous',  sample_time=10)
hb_ac.run()
print(hb_ac.solution['results'])
hb_ac.solve()

I am getting the values as: /home/dilip/anaconda3/lib/python3.6/sitepackages/MDAnalysis/analysis/hbonds/hbond_autocorrel.py:285: RuntimeWarning: Desired number of sample points too high, using 20 .format(req_frames), RuntimeWarning) Performing run 2/1[200.0%] [1. 0.40944883 0.22834645 0.15485564 0.11023622 0.0839895 0.0656168 0.03937008 0.02887139 0.02099738 0.01312336 0.01049869 0.00262467 0.00262467 0.00262467 0.00262467 0.00262467 0. 0. 0. ]

print (tau)

0.8040974832090737

print (hb_ac.solution)

{'results': array([1. , 0.40944883, 0.22834645, 0.15485564, 0.11023622, 0.0839895 , 0.0656168 , 0.03937008, 0.02887139, 0.02099738, 0.01312336, 0.01049869, 0.00262467, 0.00262467, 0.00262467, 0.00262467, 0.00262467, 0. , 0. , 0. ,

  1. , 0. , 0. , 0. , 0. ,
  2. , 0. , 0. , 0. , 0. ], dtype=float32), 'time': array([ 0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. , 5.5, 6. , 6.5, 7. , 7.5, 8. , 8.5, 9. , 9.5, 10. , 10.5,
  3. , 11.5, 12. , 12.5, 13. , 13.5, 14. , 14.5], dtype=float32), 'fit': array([0.60789634, 0.3130644 , 1.56536868]), 'tau': 0.8040974832090737, 'estimate': array([1.0000000e+00, 4.0797937e-01, 2.3191848e-01, 1.5544437e-01, 1.1029723e-01, 7.9603709e-02, 5.7729632e-02, 4.1922960e-02,. . . . -4.86388901e-03, -3.68830032e-03, -2.79195132e-03]]), 'ipvt': array([1, 2, 3], dtype=int32), 'qtf': array([-1.51693848e-09, -2.31095157e-07, 9.11203370e-07])}, 'mesg': 'Both actual and predicted relative reductions in the sum of squares\n are at most 0.000000', 'ier': 1}

So, my questions are 1] What is the default dt value in MDAnalysis? I have run a gromacs simulation of a glycine-water system for 10 ns (10000 ps) (dt=0.001 and the trajectories are saved for every 0.5 ps). So, my every frame is 0.5 ps and if the default value of the programme is 1 ps per trajectory, I think it will give a different value of lifetime for my current trajectory. So, do I need to give only trajectories saved after 1 ps?

2] (a) I am getting values for 'time': array ([ 0. , 0.5, 1. , 1.5, 2. , 2.5,......]). How can i get intervals for 0.1, 0.2, 0.3 etc?? (b) In the code, if i write "sample_time=10", it will be writing for every 10 windows... How can i increase the number of bins?

3] Since in water, the oxygen acts as both donor and acceptor, if i give
H = u.select_atoms('name OW') O = u.select_atoms('name OW') N = u.select_atoms('name HW1 HW2') hb_ac = hbonds.HydrogenBondAutoCorrel(u, acceptors=O, hydrogens=H, donors=H, bond_type = 'continuous', sample_time=10) i am getting error as: ValueError: Donors and Hydrogen groups must be matched. My question is whether every time do i need to have same numbers of donors and hydrogens? In water case, oxygen can act as a donor, so if I add oxygen list to the donor list, the number of hydrogen and donor list will not be equal.

How to solve this issues..??

Any suggestions are appreciated.

Thank you.

orbeckst commented 6 years ago

dt value

1] What is the default dt value in MDAnalysis?

MDAnalysis reads the time and the time step from your trajectory, if the trajectory format supports it. XTC does. Check with

u = mda.Universe('nptmd.tpr', 'nptmd.xtc')
print(u.trajectory.ts.dt)
print(u.trajectory.totaltime)
print(u.trajectory.n_frames)

Do you get what you expect?

times

2] (a) I am getting values for 'time': array ([ 0. , 0.5, 1. , 1.5, 2. , 2.5,......]). How can i get intervals for 0.1, 0.2, 0.3 etc??

This is automatically determined in https://github.com/MDAnalysis/mdanalysis/blob/5a232a3168c4aaa1590bc42615ebc07773f14172/package/MDAnalysis/analysis/hbonds/hbond_autocorrel.py#L327

np.arange(len(master_results), dtype=np.float32) * self.u.trajectory.dt * self._skip

from the number of samples and the number of required frames, https://github.com/MDAnalysis/mdanalysis/blob/5a232a3168c4aaa1590bc42615ebc07773f14172/package/MDAnalysis/analysis/hbonds/hbond_autocorrel.py#L282

If you want to affect the time interval, you have change the nsamples keyword in MDAnalysis.analysis.hbonds.hbond_autocorrel.HydrogenBondAutoCorrel and/or the sample_time keyword, because it is used to set req_frames https://github.com/MDAnalysis/mdanalysis/blob/5a232a3168c4aaa1590bc42615ebc07773f14172/package/MDAnalysis/analysis/hbonds/hbond_autocorrel.py#L263

donor and hydrogen selection

3] Since in water, the oxygen acts as both donor and acceptor, if i give H = u.select_atoms('name OW') O = u.select_atoms('name OW') N = u.select_atoms('name HW1 HW2') hb_ac = hbonds.HydrogenBondAutoCorrel(u, acceptors=O, hydrogens=N, donors=H, bond_type = 'continuous', sample_time=10) ... My question is whether every time do i need to have same numbers of donors and hydrogens?

(Note: You had hydrogens=H, I assume you meant hydrogens=N)

Short answer to your question: yes.

donors is tricky: from the docs

donors (AtomGroup) – The atoms which are connected to the hydrogens. This group must be identical in length to the hydrogen group and matched, ie hydrogens[0] is bonded to donors[0]. For many cases, this will mean a donor appears twice in this group.

You must be able to do

for h, d in zip(N, H):
   print(h, d)

In particular, this means that each water OW must occur twice in the donor list so that you get something like

HW1 1, OW 1
HW2 1, OW 1
HW1 2, OW 2
HW2 2, OW 2

(where I assumed you have two water molecules with resids 1 and 2). You should be able to get the appropriate donor and hydrogen groups with code similar to the following:

water = u.select_atoms("resname SOL")     # use what you need 
donors, hydrogens = [], []
for w in water.residues:
    ow, hw1, hw2 = w.atoms.select_atoms("name OW", "name HW1", "name HW2")  # use your selections
    donors.extend([ow, ow])              # note: same ow twice
    hydrogens.extend([hw1, hw2])    # once for each of the hw
donors = mda.core.groups.AtomGroup(donors)                 # raises a warning, which you should ignore 
hydrogens = mda.core.groups.AtomGroup(hydrogens)
orbeckst commented 6 years ago

NameError: name 'O' is not defined

The error tells you that you used the variable O and did not define it. In your original question you had theO atomgroup defined.

orbeckst commented 6 years ago

My basic question is that what is the difference between the two codes..?? [...] I read the literature but it is nowhere clearly mentioned, and couldn't find the difference between the two...

I don't know the answer. I would read the associated papers, write down the equations that are being solved, and compare them.

Maybe @alejob and @richardjgowers have time to comment but I wouldn't wait for it. Everyone is busy and they probably consider their papers to be sufficiently succinct.

alejob commented 6 years ago

Did you try the HB intermittent? Are both, HB intermittent, plots equals?

richardjgowers commented 6 years ago

https://www.mdanalysis.org/docs/documentation_pages/analysis/hbond_autocorrel.html#input 1) the distance is hydrogen-acceptor

2) there's two ways of defining the angle, autocorrel does the acceptor-hydrogen-donor angle, so it'll be 180 degrees for a perfect linear hydrogen bond. The other way is (180-angle) so is 0 degrees for a linear hydrogen bond, if you draw it out it'll become clear

3) if you change the distance cutoff you should get different tau values, you're calculating a different measurement. If you draw it out and consider the rates at which bonds can form/break with different geometric definitions you'll see that this should make a big difference

This doesn't seem like a coding issue, so I'm closing this and recommending that you go to to help.mdanalysis.org if you have any more questions