schrodinger / pymol-open-source

Open-source foundation of the user-sponsored PyMOL molecular visualization system.
https://pymol.org/
Other
1.2k stars 281 forks source link

Fix H-bonding issue in DistSet SelectorGetDistSet #374

Open norbert1000 opened 5 months ago

norbert1000 commented 5 months ago

When using the dist command with mode=2 for structures that have sensibly placed polar hydrogens, it becomes apparent that many hydrogen bonds with very good geometry are missed. The problem only affects atom pairs in which both atoms can have a donor and an acceptor function (i.e. waters, Ser, Thr, His and ligands).

Origin of this behavior is the DistSet SelectorGetDistSet function in layer3/Selector.cpp:

if(ai1->hb_donor && ai2->hb_acceptor) { //check for suitable geometry } else if(ai1->hb_acceptor && ai2->hb_donor) { //check for suitable geometry } else { //no hydrogen bond }

norbert1000 commented 5 months ago

I have tested the modification by @folmilsch for two structures with sensibly placed hydrogen positions and also in the absence of hydrogen atoms. It appears to work fine and the hydrogen bonds that were missing when using the original routine are now present.

JarrettSJohnson commented 5 months ago

Thanks for the PR! Excited to try this out. Are you able to point to examples of atom pairs in the PDB for me to inspect? Also, has an analysis been done for possible false positives that could arise?

norbert1000 commented 5 months ago

An example is the following file which contains hydrogens placed to chain A of pdb id 8eyj: https://research.uni-leipzig.de/straeter/pymol/data/8eyj_protoss_chainA.pdb

You can generate similar examples for any high resolution pdb structure which contains many water molecules. For hydrogen bonds between water molecules I think about half of the hydrogen bonds will be missed by the current algorithm. You need to place the polar hydrogens in sensible positions by a suitable software. In the above example I used protoss because it is freely available by a Web server. But the commercial Schrodinger suite (Maestro?) likely also has such an algorithm. In the deposited pdb files, hydrogens are either missing (because they are usually not visible in the electron density maps) or they have been added by the crystallographic refinement software in a non-sensible way for many polar hydrogens. This is the reason why the problem likely has hardly been noted as people often calculate the "hydrogen bonds" without relying on the hydrogens but using the generous parameters of the dist(mode=2) algorithm. This has been described in: https://sourceforge.net/p/pymol/mailman/message/21596748/

Back to the example of 8eyj_protoss_chainA.pdb. Please check the following examples: load 8eyj_protoss_chainA.pdb dist hbonds, all, all, mode=2 center solvent and resi 725 (also several other waters nearby) Also examples for protein side-chains: center resn Thr and resi 21 (missed H-bonds for Thr21-Thr26 and water-Thr) center resn Ser and resi 121

If you walk along the sequence to Ser, Thr or His, you will find many examples, mostly to waters as direct interactions between these residues are less likely than to water.

I did not notice unexpected false positives by the modification. The hydrogen bonds are assigned based on the geometric criteria in the ObjectMoleculeTestHBond function. With the default value of 63° for the h_bond_max_angle parameter quite weak H-bonds will also be included, but this parameter can be changed and this behavior is independent of the code modification. For

I think that the issue has been left unnoticed in the original implementation as people rarely use sensibly placed hydrogen atom positions and the h_add command does not help here. But it is a big improvement to use such algorithms to analyze hydrogen bonds. For a local PyMOL course for PhD students I described this here: https://research.uni-leipzig.de/straeter/pymol/pymol_hbond.html I will add some of this information to the PyMOL wiki at some point after studying the hbond* parameters in more detail.

JarrettSJohnson commented 5 months ago

Thanks! I haven't forgotten about this, but will briefly be unavailable until July. I'll review this afterward. Thanks!