pharmai / plip

Protein-Ligand Interaction Profiler - Analyze and visualize non-covalent protein-ligand interactions in PDB files according to 📝 Adasme et al. (2021), https://doi.org/10.1093/nar/gkab294
http://plip.biotec.tu-dresden.de
GNU General Public License v2.0

Different results between web and command line #73

Closed xrobin closed 4 years ago

xrobin commented 4 years ago

Describe the bug

Relevant files attached: 3sgj_smr.tar.gz

I'm trying to obtain the PLIP interactions for file 3sgj_smr.pdb (derived from PDB 3SGJ, see attachment above). When I run the same file on the web server, I get very different results than when I run it on the command line.

I submitted a request to the web interface by uploading this file: 68269632-0eee-493b-993d-b2a5362840e0. In the SMALLMOLECULE section, under NAG, I can see several hydrogen bonds between NAG-_-1 and several residues in chain A: LYS.24, ASP.43, ASN.75, ARG.79.

I downloaded the XML report from this web report (3sgj_smr_plipweb_report.xml) and I can clearly see those interactions in there.

When I run PLIP on the command line with

plip -f 3sgj_smr.pdb -x

I can only see NAG-_-1 making hydrogen bonds with LYS.24. I can see no interactions with ASP.43, ASN.75 or ARG.79. See the file 3sgj_smr_plipcmd_report.xml in attachment.

This XML report was generated with PLIP 2.1.3, Python 3.6.6 and OpenBabel 3.1.1. However, I get similar results with our older build (PLIP 1.4.2, Python 2.7.11, OpenBabel 2.4.1; all on CentOS 7 systems), as well as with the current code on the master branch (which I guess is equivalent to v2.1.4).

To Reproduce

Web:

Command line:

Expected behavior

I was expecting to see the same interactions reported from the command line run and on the web.

What could explain these differences?

xrobin commented 4 years ago

Specifically here are the 2 interactions I can see in 3sgj_smr_plipcmd_report.xml


  <bindingsite id="10" has_interactions="True">
    <identifiers>
      <longname>NAG</longname>
      <ligtype>SMALLMOLECULE</ligtype>
      <hetid>NAG</hetid>
      <chain>_</chain>
      <position>1</position>
      <composite>False</composite>
      <members>
        <member id="1">NAG:_:1</member>
      </members>
      <smiles>OC[C@H]1O[CH][C@@H]([C@H]([C@@H]1O)O)NC(=O)C</smiles>
      <inchikey>IKCYXSONMXYELC-LXGUWJNJSA-N
</inchikey>
    </identifiers>
    <lig_properties>
      <num_heavy_atoms>14</num_heavy_atoms>
      <num_hbd>4</num_hbd>
      <num_unpaired_hbd>2</num_unpaired_hbd>
      <num_hba>6</num_hba>
      <num_unpaired_hba>4</num_unpaired_hba>
      <num_hal>0</num_hal>
      <num_unpaired_hal>0</num_unpaired_hal>
      <num_aromatic_rings>0</num_aromatic_rings>
      <num_rotatable_bonds>3</num_rotatable_bonds>
      <molweight>204.20046</molweight>
      <logp>-1.8433000000000002</logp>
    </lig_properties>
    <interacting_chains>
      <interacting_chain id="1">A</interacting_chain>
    </interacting_chains>
    <bs_residues>
      <bs_residue id="1" contact="False" min_dist="7.3" aa="LEU">20A</bs_residue>
      <bs_residue id="2" contact="False" min_dist="7.2" aa="VAL">37A</bs_residue>
      <bs_residue id="3" contact="False" min_dist="6.2" aa="PRO">23A</bs_residue>
      <bs_residue id="4" contact="False" min_dist="3.8" aa="PHE">21A</bs_residue>
      <bs_residue id="5" contact="False" min_dist="6.3" aa="GLU">36A</bs_residue>
      <bs_residue id="6" contact="False" min_dist="5.8" aa="VAL">81A</bs_residue>
      <bs_residue id="7" contact="False" min_dist="7.3" aa="PHE">19A</bs_residue>
      <bs_residue id="8" contact="False" min_dist="4.3" aa="ARG">79A</bs_residue>
      <bs_residue id="9" contact="False" min_dist="4.5" aa="VAL">40A</bs_residue>
      <bs_residue id="10" contact="False" min_dist="3.8" aa="THR">38A</bs_residue>
      <bs_residue id="11" contact="False" min_dist="6.8" aa="CYS">39A</bs_residue>
      <bs_residue id="12" contact="False" min_dist="4.5" aa="PRO">22A</bs_residue>
      <bs_residue id="13" contact="True" min_dist="2.6" aa="LYS">24A</bs_residue>
    </bs_residues>
    <interactions>
      <hydrophobic_interactions/>
      <hydrogen_bonds>
        <hydrogen_bond id="1">
          <resnr>24</resnr>
          <restype>LYS</restype>
          <reschain>A</reschain>
          <resnr_lig>1</resnr_lig>
          <restype_lig>NAG</restype_lig>
          <reschain_lig>_</reschain_lig>
          <sidechain>True</sidechain>
          <dist_h-a>2.52</dist_h-a>
          <dist_d-a>3.45</dist_d-a>
          <don_angle>162.86</don_angle>
          <protisdon>False</protisdon>
          <donoridx>4918</donoridx>
          <donortype>O3</donortype>
          <acceptoridx>151</acceptoridx>
          <acceptortype>N3</acceptortype>
          <ligcoo>
            <x>-23.495</x>
            <y>-0.723</y>
            <z>16.244</z>
          </ligcoo>
          <protcoo>
            <x>-21.042</x>
            <y>1.101</y>
            <z>14.634</z>
          </protcoo>
        </hydrogen_bond>
        <hydrogen_bond id="2">
          <resnr>24</resnr>
          <restype>LYS</restype>
          <reschain>A</reschain>
          <resnr_lig>1</resnr_lig>
          <restype_lig>NAG</restype_lig>
          <reschain_lig>_</reschain_lig>
          <sidechain>True</sidechain>
          <dist_h-a>1.72</dist_h-a>
          <dist_d-a>2.58</dist_d-a>
          <don_angle>139.53</don_angle>
          <protisdon>True</protisdon>
          <donoridx>151</donoridx>
          <donortype>N3</donortype>
          <acceptoridx>4917</acceptoridx>
          <acceptortype>O3</acceptortype>
          <ligcoo>
            <x>-21.147</x>
            <y>-1.467</y>
            <z>14.833</z>
          </ligcoo>
          <protcoo>
            <x>-21.042</x>
            <y>1.101</y>
            <z>14.634</z>
          </protcoo>
        </hydrogen_bond>
      </hydrogen_bonds>
      <water_bridges/>
      <salt_bridges/>
      <pi_stacks/>
      <pi_cation_interactions/>
      <halogen_bonds/>
      <metal_complexes/>
    </interactions>
    <mappings>
      <smiles_to_pdb>5:4908,6:4909,7:4910,8:4911,3:4912,2:4913,12:4914,14:4915,11:4916,10:4917,9:4918,4:4919,1:4920,13:4921</smiles_to_pdb>
    </mappings>
  </bindingsite>

I can see no hydrophobic interactions:

<hydrophobic_interactions/>

In contrast, I can see 7 hydrogen bonds and 1 hydrophobic interaction on the web page: (image)

(and also in the XML file downloaded there)
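A comparison like the one described above can be scripted rather than done by eye. The following is a minimal sketch, assuming both reports use the element names visible in the excerpt above (`bindingsite`, `identifiers`, `hetid`, `chain`, `position`, `hydrogen_bond`, `restype`, `resnr`, `reschain`). The function names `hbond_partners` and `compare_reports` are hypothetical helpers and are not part of PLIP.

```python
import xml.etree.ElementTree as ET


def hbond_partners(report_xml):
    """Map each binding site ("HETID:CHAIN:POSITION") to the set of
    (restype, resnr, reschain) tuples it forms hydrogen bonds with."""
    root = ET.fromstring(report_xml)
    sites = {}
    for site in root.iter("bindingsite"):
        key = ":".join(
            (site.findtext("identifiers/" + tag) or "").strip()
            for tag in ("hetid", "chain", "position")
        )
        partners = sites.setdefault(key, set())
        for hbond in site.iter("hydrogen_bond"):
            partners.add(tuple(
                (hbond.findtext(tag) or "").strip()
                for tag in ("restype", "resnr", "reschain")
            ))
    return sites


def compare_reports(web_xml, cmd_xml):
    """Return {site: (web_only, cmd_only)} for every binding site whose
    hydrogen-bond partner residues differ between the two reports."""
    web, cmd = hbond_partners(web_xml), hbond_partners(cmd_xml)
    diff = {}
    for site in sorted(set(web) | set(cmd)):
        web_only = web.get(site, set()) - cmd.get(site, set())
        cmd_only = cmd.get(site, set()) - web.get(site, set())
        if web_only or cmd_only:
            diff[site] = (web_only, cmd_only)
    return diff
```

To use it on the attached files, read each report in binary mode, for example `compare_reports(open("3sgj_smr_plipweb_report.xml", "rb").read(), open("3sgj_smr_plipcmd_report.xml", "rb").read())`. The sketch compares residues only, so two hydrogen bonds to the same residue count once.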

fkaiserbio commented 4 years ago

Dear xrobin,

thanks for reporting this inconsistency. First of all, you should prefer the v2.1.4 command line tool in your case, as the web server is (unfortunately) running an outdated version of PLIP. So in principle it still suffers from all the issues that we resolved recently, e.g. inconsistent hydrogen bond annotation.

For your particular case, it could be related to the custom input structure you are providing. This structure contains a carbohydrate polymer, but it is missing the LINK entries that are present in the original PDB file and that PLIP requires to recognize the polymer:

...
LINK         O4  NAG D   2                 C1  BMA D   3     1555   1555  1.41  
LINK         O3  BMA D   3                 C1  MAN D   4     1555   1555  1.43  
LINK         O6  BMA D   3                 C1  MAN D   6     1555   1555  1.42  
LINK         O2  MAN D   4                 C1  NAG D   5     1555   1555  1.43  
...

For now I can only speculate that the old version of PLIP (or maybe even the older OpenBabel version) handles the small molecules differently if no link entries are present.
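Whether a custom PDB file still carries such LINK entries can be checked before running PLIP. The following is a minimal sketch, assuming standard fixed-column PDB records (residue name, chain, residue number and insertion code at columns 18-27 for HETATM and for the first LINK partner, and at columns 48-57 for the second LINK partner). The helper names are hypothetical, and a hetero residue without a LINK record is not necessarily an error; it is only a hint that a polymer may be split into separate ligands.

```python
def residue_id(line, start):
    """Read (resname, chain, resseq, icode) from fixed PDB columns.
    `start` is the 0-based index of the 3-letter residue-name field."""
    return (
        line[start:start + 3].strip(),
        line[start + 4],
        line[start + 5:start + 9].strip(),
        line[start + 9].strip(),
    )


def link_pairs(pdb_text):
    """Return the residue pairs connected by LINK records."""
    pairs = []
    for line in pdb_text.splitlines():
        if line.startswith("LINK  "):
            line = line.ljust(57)  # guard against stripped trailing blanks
            pairs.append((residue_id(line, 17), residue_id(line, 47)))
    return pairs


def unlinked_hetero_residues(pdb_text, ignore=("HOH",)):
    """List HETATM residues that appear in no LINK record (waters skipped)."""
    linked = {res for pair in link_pairs(pdb_text) for res in pair}
    seen = []
    for line in pdb_text.splitlines():
        if line.startswith("HETATM"):
            res = residue_id(line.ljust(27), 17)
            if res[0] not in ignore and res not in linked and res not in seen:
                seen.append(res)
    return seen
```

For a glycan such as the one in 3SGJ, every sugar of the chain would be expected to show up in `link_pairs`; sugars returned by `unlinked_hetero_residues` are candidates for missing LINK entries.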

If I use the structure from PDB 3sgj, the results are perfectly consistent between the web server and the command line (shown for ligand 3SGJ-NAG-D-1 at the end of this post).

The web server is maintained by our academic collaborators and afaik they are planning a relaunch currently. So, in a nutshell, I would kindly ask you to rely on the latest command line implementation for now and to provide link entries in your custom PDB files whenever you want to handle a polymer ligand. I have added a comment to the front page README that makes users aware of this issue.

Command Line

3SGJ_NAG_D_1_cmd

Web Server

3SGJ_NAG_D_1_web

xrobin commented 4 years ago

Hi Florian,

Thanks a lot for your quick answer and the useful pointer. I've been investigating the issue in a bit more depth and I think I might have found where the issue actually comes from.

I'm attaching the "fixed" files with the LINK lines, both with insertion codes (1A-1G) and without (1-8), in case you're interested in checking them out: 3sgj_smr_link.tar.gz.

I will investigate how we can avoid insertion codes in our application.
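One possible way to avoid insertion codes is to renumber residues before handing the file to PLIP. The following is a minimal sketch, assuming that sequential numbering per chain (starting at 1, in order of first appearance) is acceptable downstream and that residue numbers stay below 10000. The function name `renumber_without_icodes` is hypothetical. Only ATOM, HETATM, ANISOU, TER and LINK records are rewritten; other records that reference residue numbers (SSBOND, HELIX, SITE and so on) are left untouched and would need the same treatment.

```python
def renumber_without_icodes(pdb_text):
    """Renumber residues sequentially per chain and blank the insertion
    code column, keeping LINK records consistent with the new numbers."""
    lines = pdb_text.splitlines()
    mapping = {}   # (chain, resSeq field, iCode) -> new residue number
    counters = {}  # chain -> last number handed out

    # First pass: assign new numbers in order of first appearance.
    for line in lines:
        if line.startswith(("ATOM  ", "HETATM")):
            line = line.ljust(27)
            key = (line[21], line[22:26], line[26])
            if key not in mapping:
                counters[key[0]] = counters.get(key[0], 0) + 1
                mapping[key] = counters[key[0]]

    def patch(line, chain_idx):
        # chain_idx: 0-based index of the chain ID; the 4-column resSeq
        # and the 1-column iCode follow directly after it.
        key = (line[chain_idx],
               line[chain_idx + 1:chain_idx + 5],
               line[chain_idx + 5])
        if key not in mapping:
            return line
        return (line[:chain_idx + 1] + "%4d " % mapping[key]
                + line[chain_idx + 6:])

    # Second pass: rewrite residue numbers and clear insertion codes.
    out = []
    for line in lines:
        if line.startswith(("ATOM  ", "HETATM", "ANISOU", "TER   ")):
            line = patch(line.ljust(27), 21)
        elif line.startswith("LINK  "):
            line = patch(patch(line.ljust(57), 21), 51)
        out.append(line)
    return "\n".join(out)
```

With this scheme, residues 1A and 1B of a chain would become 1 and 2, and a LINK record between them would point at the new numbers.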

fkaiserbio commented 4 years ago

Dear Xavier,

thanks for checking that so deeply. Indeed, PLIP gets puzzled by insertion codes and all alternative locations are either:

The intended behavior that alternative locations describe monomers of a polymer is not supported at the moment. Is it fine for you to get away from insertion codes? I think implementing such a "feature" would be tedious.

Best,
Florian

xrobin commented 4 years ago

Unfortunately (and quite unsurprisingly) the altlocation flag has no effect on insertion codes.

That being said, I can understand that you don't want to support a legacy feature that's no longer used for residue numbering in mmCIF files.

Maybe only one thing: could PLIP display a warning when it finds an insertion code, to make it clear that it's not supported? As there were results, with bonds that actually made sense, it took me way too long to even realize that something was off.
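A check of this kind could look roughly as follows. This is a minimal sketch and not PLIP's implementation: it assumes standard fixed-column PDB records, where the insertion code sits in column 27 of ATOM and HETATM lines (the alternative location indicator is a different field, in column 17). The function names are hypothetical.

```python
import warnings


def find_insertion_codes(pdb_text):
    """Return the sorted unique (resname, chain, resseq, icode) tuples of
    ATOM/HETATM records that carry a non-blank insertion code (column 27)."""
    found = set()
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM  ", "HETATM")) and len(line) > 26:
            icode = line[26]
            if icode != " ":
                found.add((line[17:20].strip(), line[21],
                           line[22:26].strip(), icode))
    return sorted(found)


def warn_on_insertion_codes(pdb_text):
    """Emit a single warning if any residue uses an insertion code."""
    hits = find_insertion_codes(pdb_text)
    if hits:
        examples = ", ".join("%s %s %s%s" % hit for hit in hits[:5])
        warnings.warn(
            "%d residue(s) use insertion codes, which may not be handled "
            "correctly (e.g. %s)" % (len(hits), examples)
        )
    return hits
```

Running `warn_on_insertion_codes(open("3sgj_smr.pdb").read())` before the analysis would flag residues such as 1A-1G up front instead of leaving the user to notice the missing interactions.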

fkaiserbio commented 4 years ago

You're right, the alternative location flag does not affect insertion codes. I will convert this to a milestone for the next release.