OpenBioSim / biosimspace

An interoperable Python framework for biomolecular simulation.
https://biosimspace.openbiosim.org
GNU General Public License v3.0
77 stars 12 forks source link

[BUG?] Incompatible AMBER atom names for protein cap residues #365

Open jmichel80 opened 1 week ago

jmichel80 commented 1 week ago

I have come across something annoying with AMBER templates.   So if you export a PDB from Flare version 8.0.2 in 'AMBER PDB' mode it writes ACE and NME residues with atom names consistent with those found in  

$AMBERHOME/dat/leap/lib/aminont10.lib and 
$AMBERHOME/dat/leap/lib/aminoct10.lib

  In these lib files we have for ACE  

!entry.ACE.unit.atoms table  str name  str type  int typex  int resx  int flags  int seq  int elmnt  dbl chg
 "HH31" "HC" 0 1 131072 1 1 0.112300
 "CH3" "CT" 0 1 131072 2 6 -0.366200
 "HH32" "HC" 0 1 131072 3 1 0.112300
 "HH33" "HC" 0 1 131072 4 1 0.112300
 "C" "C" 0 1 131072 5 6 0.597200

 for NME

!entry.NME.unit.atoms table  str name  str type  int typex  int resx  int flags  int seq  int elmnt  dbl chg
 "N" "N" 0 1 131072 1 7 -0.415700
 "H" "H" 0 1 131072 2 1 0.271900
 "CH3" "CT" 0 1 131072 3 6 -0.149000
 "HH31" "H1" 0 1 131072 4 1 0.097600
 "HH32" "H1" 0 1 131072 5 1 0.097600
 "HH33" "H1" 0 1 131072 6 1 0.097600

    However ambertools (here 23.6 from conda-forge) now seems to load lib files for all their forcefields from  

$AMBERHOME/dat/leap/lib/aminont12.lib and 
$AMBERHOME/dat/leap/lib/aminoct12.lib

  In these entries ACE is specified as  

!entry.ACE.unit.atoms table  str name  str type  int typex  int resx  int flags  int seq  int elmnt  dbl chg
 "H1" "HC" 0 1 131072 1 1 0.112300
 "CH3" "CT" 0 1 131072 2 6 -0.366200
 "H2" "HC" 0 1 131072 3 1 0.112300
 "H3" "HC" 0 1 131072 4 1 0.112300
 "C" "C" 0 1 131072 5 6 0.597200
 "O" "O" 0 1 131072 6 8 -0.567900
(…)

  And NME as  

!entry.NME.unit.atoms table  str name  str type  int typex  int resx  int flags  int seq  int elmnt  dbl chg
 "N" "N" 0 1 131072 1 7 -0.415700
 "H" "H" 0 1 131072 2 1 0.271900
 "C" "CT" 0 1 131072 3 6 -0.149000
 "H1" "H1" 0 1 131072 4 1 0.097600
 "H2" "H1" 0 1 131072 5 1 0.097600
 "H3" "H1" 0 1 131072 6 1 0.097600
(…)

  As a result  trying to parameterise a protein exported from Flare   protein = BSS.Parameters.ff14SB(protein, work_dir='tmp').getMolecule()   fails with this leap error message  

FATAL:  Atom .R<ACE 1>.A<HH32 7> does not have a type.
FATAL:  Atom .R<ACE 1>.A<HH33 8> does not have a type.
FATAL:  Atom .R<ACE 1>.A<HH31 9> does not have a type.
FATAL:  Atom .R<NGLU 3>.A<H 18> does not have a type.
FATAL:  Atom .R<NME 164>.A<CH3 7> does not have a type.
FATAL:  Atom .R<NME 164>.A<HH31 8> does not have a type.
FATAL:  Atom .R<NME 164>.A<HH32 9> does not have a type.
FATAL:  Atom .R<NME 164>.A<HH33 10> does not have a type.

 

I wonder if there is a sensible way to automatically deal with inconsistent amber naming schemes behind the scenes when calling   protein_xtal = BSS.IO.readPDB('inputs/3STD_prep.pdb', pdb4amber=True)[0]   Note how pdb4amber doesn’t seem to catch this issue.

To reproduce the issue

3STD.zip

import BioSimSpace as BSS

protein_xtal = BSS.IO.readPDB('inputs/3STD_prep.pdb', pdb4amber=True)[0]

protein = protein_xtal.extract(
   [atom.index() for atom in protein_xtal.search("not resname HOH").atoms()]
)

protein = BSS.Parameters.ff14SB(protein, work_dir='tmp').getMolecule()

cat tmp/leap.log

tagging @mark-mackey-cresset as this is perhaps more a Flare bug than a BioSimSpace bug

lohedges commented 1 week ago

Ooof, that's annoying. It's not even that we could load the most recent template from the ambertools dat directory, then use Sire to rename the residues, since the charge is different too.

lohedges commented 1 week ago

I guess that wouldn't matter if you're reparametrising anyway, so maybe we could just swap the template. The fiddly bit would be working out what template was included. We could probably get this by parsing the files after the failed parameterisation and following the imports.

mark-mackey-cresset commented 1 week ago

We found this a while ago and have been trying to work out what to do about it, as the atom names that AMBER expects seem to vary depending on the context. As I recall the ambertools have a dictionary of synonyms to work around this issue, but we should probably update our code to produce the new amber names rather than the old ones.

mark-mackey-cresset commented 1 week ago

Also, @jmichel80 you should upgrade to Flare 9 :)