Closed pwnusmaximus closed 1 year ago
While looking through the MDAnalysis.log file it appears that selection2_type is ignored when selection1_type is set. ie. It appears that selection 2 and selection 1 are inverse of each other. For most cases this makes sense. However, for second order water bridges an Acceptor <-> WAT <-> WAT <-> Acceptor is perfectly reasonable.
Perhaps there is a method to overrule this behaviour and I didn't do it properly.
@xiki-tempula as you're the orginal author of the wbridge code, can you have a look at this question/issue please? Thanks!
@Pwnus Thanks for the issue. I will have a look in the weekends. I wonder why do you do distance_type = 'heavy'
?
Hi @xiki-tempula, for me distance_type = 'heavy' is just a convention. Our usual analysis software of choice ,cpptraj from Ambertools2021, defaults to heavy atom distances and I've grown accustomed to it as a measure.
I can reproduce the problem. Let me have a look.
Ok, I found out why this failed. selection1_type
is a keyword argument, but selection2_type
is not a keyword argument.
selection1_type : {"donor", "acceptor", "both"} (optional)
Selection 1 can be 'donor', 'acceptor' or 'both'. Note that the
value for `selection1_type` automatically determines how
`selection2` handles donors and acceptors: If `selection1` contains
'both' then `selection2` will also contain 'both'. If `selection1`
is set to 'donor' then `selection2` is 'acceptor' (and vice versa).
['both'].
So the code only checks for three cases,
When selection1_type='donor'
, it assumes selection1_type='donor', selection2_type='acceptor'
.
When selection1_type='acceptor'
, it assumes selection1_type='acceptor', selection2_type='donor'
.
When selection1_type='both'
, it assumes selection1_type='both', selection2_type='both'
.
So when it sees selection1_type='acceptor', selection2_type='acceptor'
, it just reverts to selection1_type='acceptor', selection2_type='donor'
.
Ok, there is another thing. Sorry if the doc is not very clear but donors
is actually the heavy atom associated with the donor.
Sorry the distance_type = 'heavy'
is not very well maintained mainly because I originally don't understand how this works and kind of inherit the code from other part of the MDA.
However, the distance_type='hydrogen'
does work. In this case, it would be
import MDAnalysis as mda
#Create Universe
parm = "2w35-DL-HID.prmtop"
traj = "2w35-DL-HID-rep2-500ns-10frames.mdcrd"
u = mda.Universe(parm,traj) # always start with a Universe
w = MDAnalysis.analysis.hydrogenbonds.WaterBridgeAnalysis(u,
'resname ASP and name OD2',
'resname DA and name P',
update_selection=True,
water_selection='resname WAT',
order=2,
update_water_selection=True,
distance=4.0, angle=120.0,
debug=True,
verbose=True,
acceptors=('P', 'OD1', 'OD2', 'O'),
donors=('O'),
filter_first=True
)
w.run()
print(w.timeseries)
Gives
[[(1817, 4251, ('ASP', 110, 'OD2'), ('WAT', 351, 'H2'), 3.393591355721304, 133.6406052508985), (4251, 4252, ('WAT', 351, 'H2'), ('WAT', 352, 'O'), 3.1510935551442802, 129.3098825714121), (4253, 3774, ('WAT', 352, 'H1'), ('DA', 230, 'P'), 2.9032325126961362, 142.4088206311063), (720, 4254, ('ASP', 43, 'OD2'), ('WAT', 352, 'H2'), 1.8726619038368872, 132.56006363521206), (720, 4292, ('ASP', 43, 'OD2'), ('WAT', 365, 'H1'), 1.850974019502122, 150.58244069848672)], [(1817, 4253, ('ASP', 110, 'OD2'), ('WAT', 352, 'H1'), 1.849568207751335, 172.96529196011156), (4254, 9151, ('WAT', 352, 'H2'), ('WAT', 1985, 'O'), 1.8087280419588374, 148.28580215335526), (9153, 3774, ('WAT', 1985, 'H2'), ('DA', 230, 'P'), 2.246521446223992, 167.5724465796556), (720, 4254, ('ASP', 43, 'OD2'), ('WAT', 352, 'H2'), 3.429838909172761, 139.9579791603104), (720, 4293, ('ASP', 43, 'OD2'), ('WAT', 365, 'H2'), 1.8959299736466242, 171.89039279707592), (720, 9152, ('ASP', 43, 'OD2'), ('WAT', 1985, 'H1'), 1.8671204264943673, 153.31218009273726), (720, 14307, ('ASP', 43, 'OD2'), ('WAT', 3703, 'H2'), 3.8169042950036887, 144.42441139786533)], [(720, 4257, ('ASP', 43, 'OD2'), ('WAT', 353, 'H2'), 3.64963647552059, 139.45071906498148), (4256, 3774, ('WAT', 353, 'H1'), ('DA', 230, 'P'), 2.834134138868142, 129.92633270270065), (720, 4293, ('ASP', 43, 'OD2'), ('WAT', 365, 'H2'), 1.818111829419461, 161.60502328555168), (720, 9153, ('ASP', 43, 'OD2'), ('WAT', 1985, 'H2'), 2.015430888351284, 136.59739586186618)], [(720, 9153, ('ASP', 43, 'OD2'), ('WAT', 1985, 'H2'), 1.8203121752801656, 158.58513345751518), (9151, 4293, ('WAT', 1985, 'O'), ('WAT', 365, 'H2'), 3.479766986115513, 133.83175881878554), (4293, 3774, ('WAT', 365, 'H2'), ('DA', 230, 'P'), 3.060998864481397, 140.26838790231153), (9153, 4291, ('WAT', 1985, 'H2'), ('WAT', 365, 'O'), 3.608196269337274, 122.48698663619565)], [(720, 4293, ('ASP', 43, 'OD2'), ('WAT', 365, 'H2'), 1.843827286086128, 144.45964232357517), (4292, 3774, ('WAT', 365, 'H1'), ('DA', 230, 'P'), 2.8939357380099944, 134.56966096070747), (4292, 5590, ('WAT', 365, 'H1'), ('WAT', 798, 'O'), 3.1605338845771804, 138.05900583208216), (720, 5591, ('ASP', 43, 'OD2'), ('WAT', 798, 'H1'), 3.9242542859284426, 149.26346158504964), (720, 13614, ('ASP', 43, 'OD2'), ('WAT', 3472, 'H2'), 1.7618811985131957, 154.11506898283204)], [(720, 9152, ('ASP', 43, 'OD2'), ('WAT', 1985, 'H1'), 1.892213094315617, 158.488408366625), (9151, 16995, ('WAT', 1985, 'O'), ('WAT', 4599, 'H2'), 1.957615569970479, 165.2898111917652), (16994, 3774, ('WAT', 4599, 'H1'), ('DA', 230, 'P'), 2.7888123616948706, 134.82884905540615)], [(1817, 4106, ('ASP', 110, 'OD2'), ('WAT', 303, 'H1'), 1.833734411648239, 148.7257072909245), (4107, 9151, ('WAT', 303, 'H2'), ('WAT', 1985, 'O'), 2.4095311772259493, 122.19942481564095), (9152, 3774, ('WAT', 1985, 'H1'), ('DA', 230, 'P'), 3.5913188516541963, 146.17430132993326), (720, 4292, ('ASP', 43, 'OD2'), ('WAT', 365, 'H1'), 1.8416829164441975, 152.57002058611997), (720, 9153, ('ASP', 43, 'OD2'), ('WAT', 1985, 'H2'), 1.8229850078234018, 177.70100450495477)], [(1817, 4106, ('ASP', 110, 'OD2'), ('WAT', 303, 'H1'), 1.740858427970088, 151.83055523471987), (4107, 6952, ('WAT', 303, 'H2'), ('WAT', 1252, 'O'), 3.445888301940925, 140.5532946438981), (6953, 3774, ('WAT', 1252, 'H1'), ('DA', 230, 'P'), 3.2492178833565544, 141.50511711223422), (720, 4293, ('ASP', 43, 'OD2'), ('WAT', 365, 'H2'), 1.951102535539977, 158.61324639824332), (720, 6954, ('ASP', 43, 'OD2'), ('WAT', 1252, 'H2'), 1.9869096728458162, 155.66059556502225), (720, 13515, ('ASP', 43, 'OD2'), ('WAT', 3439, 'H2'), 3.9463436426809833, 132.2588781538232)], [(720, 4292, ('ASP', 43, 'OD2'), ('WAT', 365, 'H1'), 2.2405464382157216, 122.84506810889205), (4292, 3774, ('WAT', 365, 'H1'), ('DA', 230, 'P'), 3.1773631323951115, 126.97784464396653), (4291, 6954, ('WAT', 365, 'O'), ('WAT', 1252, 'H2'), 3.748528653581033, 123.84650122269805), (720, 6954, ('ASP', 43, 'OD2'), ('WAT', 1252, 'H2'), 3.5016609783952184, 167.98769036196092), (1817, 9152, ('ASP', 110, 'OD2'), ('WAT', 1985, 'H1'), 1.9499123859961573, 145.10239635368376), (720, 9153, ('ASP', 43, 'OD2'), ('WAT', 1985, 'H2'), 1.5926231286622126, 175.76655312759172)], []]
This is also fixed in https://github.com/MDAnalysis/mdanalysis/pull/4066
import MDAnalysis as mda
#Create Universe
parm = "2w35-DL-HID.prmtop"
traj = "2w35-DL-HID-rep2-500ns-10frames.mdcrd"
u = mda.Universe(parm,traj) # always start with a Universe
w = MDAnalysis.analysis.hydrogenbonds.WaterBridgeAnalysis(u,
'resname ASP and name OD2',
'resname DA and name P',
update_selection=True,
water_selection='resname WAT',
order=2,
update_water_selection=True,
distance=4.0, angle=120.0,
distance_type = 'heavy',
debug=True,
verbose=True,
acceptors=('P', 'OD1', 'OD2', 'O'),
donors=('O'),
filter_first=True
)
w.run()
print(w.timeseries)
[[(720, 4254, ('ASP', 43, 'OD2'), ('WAT', 352, 'H2'), 2.6173148105009867, 132.56006363521206), (4253, 3774, ('WAT', 352, 'H1'), ('DA', 230, 'P'), 3.7078930697486436, 142.4088206311063), (720, 4292, ('ASP', 43, 'OD2'), ('WAT', 365, 'H1'), 2.7255888461800195, 150.58244069848672)], [(1817, 4253, ('ASP', 110, 'OD2'), ('WAT', 352, 'H1'), 2.8022313402893086, 172.96529196011156), (4254, 9151, ('WAT', 352, 'H2'), ('WAT', 1985, 'O'), 2.670479345137013, 148.28580215335526), (9153, 3774, ('WAT', 1985, 'H2'), ('DA', 230, 'P'), 3.1885889101394085, 167.5724465796556), (720, 4293, ('ASP', 43, 'OD2'), ('WAT', 365, 'H2'), 2.846776898715446, 171.89039279707592), (720, 9152, ('ASP', 43, 'OD2'), ('WAT', 1985, 'H1'), 2.756608767156218, 153.31218009273726)], [(720, 4293, ('ASP', 43, 'OD2'), ('WAT', 365, 'H2'), 2.742949075816077, 161.60502328555168), (4291, 4257, ('WAT', 365, 'O'), ('WAT', 353, 'H2'), 2.9116107495969175, 160.3826600196096), (4256, 3774, ('WAT', 353, 'H1'), ('DA', 230, 'P'), 3.525871616409098, 129.92633270270065), (720, 9153, ('ASP', 43, 'OD2'), ('WAT', 1985, 'H2'), 2.789022173597509, 136.59739586186618)], [], [(720, 4293, ('ASP', 43, 'OD2'), ('WAT', 365, 'H2'), 2.680570453923368, 144.45964232357517), (4292, 3774, ('WAT', 365, 'H1'), ('DA', 230, 'P'), 3.6304643721412186, 134.56966096070747), (720, 13614, ('ASP', 43, 'OD2'), ('WAT', 3472, 'H2'), 2.6558193872943874, 154.11506898283204)], [(720, 9152, ('ASP', 43, 'OD2'), ('WAT', 1985, 'H1'), 2.8048798451527235, 158.488408366625), (9151, 16995, ('WAT', 1985, 'O'), ('WAT', 4599, 'H2'), 2.8941625694269955, 165.2898111917652), (16994, 3774, ('WAT', 4599, 'H1'), ('DA', 230, 'P'), 3.5297488120647786, 134.82884905540615)], [], [], [(720, 4292, ('ASP', 43, 'OD2'), ('WAT', 365, 'H1'), 2.873803650917737, 122.84506810889205), (4292, 3774, ('WAT', 365, 'H1'), ('DA', 230, 'P'), 3.829556436424445, 126.97784464396653), (720, 9153, ('ASP', 43, 'OD2'), ('WAT', 1985, 'H2'), 2.549059271453923, 175.76655312759172), (1817, 9152, ('ASP', 110, 'OD2'), ('WAT', 1985, 'H1'), 2.7889963607587984, 145.10239635368376)], []]
Thank you xiki-tempula
I've made notes to myself for that little "gotcha" to look out for when running this type of analysis and the correct options to ask for. Thank you for the solution :)
It'll be highly valuable to me and my lab-mates.
@pwnusmaximus Please don't close the issue yet as we still need to merge the fix for the distance_type = 'heavy'
@pwnusmaximus Please don't close the issue yet as we still need to merge the fix for the distance_type = 'heavy'
My mistake, I've re-opened the issue,
Expected behavior
I want to quantify a second order water bridge (two waters between a protein residue (resid 43 and name OD2) and a DNA nucleotide (resid 230 and name P). I would like a time series of the presence of these bonds.
resid 43 = resname ASP resid 230 = resname DA
Troubleshooting thus far
Notably, selection two is a phosphorus atom. (not a usual hydrogen bond acceptor atom) I know this 'P' isn't in any standard library so I added it with the "acceptors" keyword in addition to forcing selection2_type to be an acceptor. I tried troubleshooting by enlarging the acceptable ranges for angle and distance, and even manually specifying every donor and acceptor used in the interaction in-case any were not in the default atom mask libraries. I also played with using resid, resname or simply atom masks to see if the selection was the issue. I tried on a full trajectory and truncated trajectories where I've manually extracted the frames with water bridges present.
I've verified visually in pymol that the water bridge is indeed present and within spec for the identities, angles and distances I used.
Actual behavior
you can see on the last line no hydrogen bonds are identified. I've opened up the 'w' dataframe and confirmed that the universe was created successfully and that the correct acceptors and donors are listed.
Code to reproduce the behavior
Current version of MDAnalysis
Attached files: 2w35-DL-HID-rep2-500ns-10frames.zip 2w35-DL-HID.zip MDAnalysis.log