addMissingAtoms leads to "Multiple matching templates found"

juliebehr commented 8 years ago

I am trying to run the following:

from pdbfixer import PDBFixer

pdbid = '1OL5'
fixer = PDBFixer(pdbid=pdbid)
fixer.findMissingResidues()
fixer.findMissingAtoms()
fixer.addMissingAtoms()

And end up with this error:

Traceback (most recent call last):
  File "pdbfix-TPX2-only.py", line 12, in <module>
    fixer.addMissingAtoms()
  File "/home/behrj/miniconda/lib/python2.7/site-packages/pdbfixer/pdbfixer.py", line 804, in addMissingAtoms
    system = forcefield.createSystem(newTopology)
  File "/home/behrj/miniconda/lib/python2.7/site-packages/simtk/openmm/app/forcefield.py", line 820, in createSystem
    [template, matches] = self._getResidueTemplateMatches(res, bondedToAtom)
  File "/home/behrj/miniconda/lib/python2.7/site-packages/simtk/openmm/app/forcefield.py", line 610, in _getResidueTemplateMatches
    raise Exception('Multiple matching templates found for residue %d (%s).' % (res.index+1, res.name))
Exception: Multiple matching templates found for residue 3 (LYS).

It wasn't clear to me whether this error is originating within pdbfixer (soft.xml?) or openmm (using latest version of openmm-dev). (Note that this is a section of a script I wrote and used successfully months ago, and I've also reproduced the error using other pdbids, so I don't think the problem is inherent to this pdb. I've gotten this error with every non-protonated pdbid I've tried.)

peastman commented 8 years ago

It looks like the problem was created by this commit: https://github.com/pandegroup/openmm/commit/4e9b65cf26f10820f4c46f4fc9dadc32e2264a61. The templates in soft.xml don't include any hydrogens, so ones that would normally differ based only on hydrogens (like GLU and GLH) are identical. Previously OpenMM didn't care about that. It would just return the first matching template it found. But it now throws an exception in that case. This change was made post-7.0, so the problem only appears when using the development version of OpenMM, not the release.

We need to figure out how to deal with this. I think the change to OpenMM is good one. If there are multiple matching templates, that likely indicates an error on the user's part. So I'm inclined to say the "right" solution is to remove the duplicate templates from soft.xml. This does create one problem, though: it means once OpenMM 7.1 is released, PDBFixer 1.3 will not work correctly with it. You'll need to upgrade both OpenMM and PDBFixer together. Does that seem reasonable, or will it create problems for people?

jchodera commented 8 years ago

We can pin the range of OpenMM releases that pdbfixer 1.3 supports in the conda recipe. Should it just be pinned to 7.0rc1?

I think there is some discussion about adding additional pdbfixer support for specifying a given forcefield (to support all residue definitions from that specified forcefield) as well as adding templates for everything in the PDB. Might be worth considering the impact of this on those feature requests too.

peastman commented 8 years ago

Presumably any force field you gave it would be something that worked correctly in OpenMM. The issue is that we have to match templates without hydrogens, since of course your input file may be missing them. So we would have to take the given force field, remove hydrogens, then identify any templates that had become identical and remove the duplicate ones.

jchodera commented 8 years ago

I could see this might be a problem with small molecules, since the same small molecule heavy atoms might have different protonation states that differ only by which hydrogens are present. If we have bond order available, we could figure out what protons are needed, but this information isn't directly available in the PDB file (though it can sometimes be inferred from atom distances). There might be more effective strategies for handling small molecules, though, such as using the HETNAM fields and the PDB components directory. So it may not be a major concern.

peastman commented 8 years ago

If your molecule has missing hydrogens, there had bettern be some way for us to tell which ones they are. Otherwise, we can't add them!

jchodera commented 8 years ago

Believe me, the struggle is real.

peastman commented 8 years ago

I believe you! It just isn't a problem we can solve through the design of PDBFixer. It can't do anything with information it doesn't have.

jchodera commented 8 years ago

@peastman: I wanted to repeat the question in case you missed it, since we should figure this out before merging the pdbfixer conda recipe PR:

We can pin the range of OpenMM releases that pdbfixer 1.3 supports in the conda recipe. Should it just be pinned to 7.0rc1?

peastman commented 8 years ago

Assuming we choose the solution suggested above (just remove the duplicate templates from soft.xml and leave everything else alone), then yes. Is everyone ok with that solution?

juliebehr commented 8 years ago

Fine with me!

jchodera commented 8 years ago

I think it's reasonable to try this and see if it leaves us stuck somewhere.

peastman commented 8 years ago

Ok, the change is merged.

openmm / pdbfixer

addMissingAtoms leads to "Multiple matching templates found" #131