harryjubb / arpeggio

Calculation of interatomic interactions in molecular structures
http://biosig.unimelb.edu.au/arpeggioweb/
GNU General Public License v3.0

Arpeggio dies when PDB file includes altLoc specifier #7

Closed mchonofsky closed 5 years ago

mchonofsky commented 5 years ago

See below. Removing the altLoc specifications in 3QNM.pdb resolves the problem.

$ arpeggio.py 3QNM.pdb
ERROR//15:58:12.122//OpenBabel OBAtom with PDB serial number 1193 could not be matched to a BioPython counterpart.
Traceback (most recent call last):
  File "/homes/chonofsk/bin/arpeggio/arpeggio.py", line 752, in <module>
    raise OBBioMatchError(serial)
__main__.OBBioMatchError

harryjubb commented 5 years ago

Hi,

This is usually caused either by problems with the atom serial numbering in the PDB file or by an issue specific to alternate locations. Unfortunately, distinguishing alternate-location atoms and matching them between OpenBabel and BioPython has been a challenge.

I recommend removing alternate locations either manually or with a script (e.g., see https://github.com/harryjubb/arpeggio/blob/master/README.md#biopythonopenbabel-are-complaining-about-my-structure-whats-happening).
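For anyone who wants a quick standalone option, here is a minimal sketch of such a script. It is not the script linked from the README. It assumes a fixed-column PDB file in which the altLoc identifier sits in column 17 of ATOM, HETATM and ANISOU records, and it assumes that keeping the `A` conformer is acceptable for your analysis. The function name `strip_altlocs` and the `keep` parameter are made up for this example. Atom serial numbers are left untouched.

```python
def strip_altlocs(pdb_lines, keep="A"):
    """Return PDB lines with only one alternate location retained.

    Atom records whose altLoc (column 17, index 16) is blank or equal
    to `keep` are kept, and their altLoc column is blanked. Records with
    any other altLoc are dropped. Non-atom records pass through as-is.
    """
    kept = []
    for line in pdb_lines:
        is_atom_record = line.startswith(("ATOM", "HETATM", "ANISOU"))
        if is_atom_record and len(line) > 16:
            altloc = line[16]
            if altloc not in (" ", keep):
                continue  # drop atoms from the other conformers
            line = line[:16] + " " + line[17:]  # blank the altLoc column
        kept.append(line)
    return kept


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python strip_altlocs.py input.pdb output.pdb
    with open(sys.argv[1]) as handle:
        cleaned = strip_altlocs(handle.readlines())
    with open(sys.argv[2], "w") as handle:
        handle.writelines(cleaned)
```

Occupancies are not rescaled here, so check whether that matters for anything downstream of Arpeggio.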

If you need to consider contacts from both alternate locations, and you have only a few positions to consider, one option is to run Arpeggio on each alternate location separately, then merge the outputs, depending on what you're interested in (e.g. if you're only looking to see if the residue makes an H-bond, you can logical AND the Hbond part of the interaction fingerprint).
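As a rough illustration of the merge step, the sketch below combines two fingerprints element by element. It assumes you have already parsed the fingerprint for the same residue from each run into equal-length lists of 0/1 values. Parsing Arpeggio's output files is not shown, and the function name `merge_fingerprints` is hypothetical. An AND keeps only the interactions seen in both alternate locations, while an OR keeps those seen in either one.

```python
def merge_fingerprints(fp_a, fp_b, op="and"):
    """Combine two binary interaction fingerprints position by position.

    fp_a, fp_b: equal-length sequences of 0/1 flags for the same residue,
                one from each alternate-location run (assumed pre-parsed).
    op: "and" keeps interactions present in both runs,
        "or" keeps interactions present in at least one run.
    """
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same length")
    if op == "and":
        return [int(bool(x) and bool(y)) for x, y in zip(fp_a, fp_b)]
    if op == "or":
        return [int(bool(x) or bool(y)) for x, y in zip(fp_a, fp_b)]
    raise ValueError("op must be 'and' or 'or'")
```

For example, `merge_fingerprints([1, 0, 1], [1, 1, 0])` yields `[1, 0, 0]`, and the same call with `op="or"` yields `[1, 1, 1]`.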