michellab / Sire

Sire Molecular Simulations Framework
http://siremol.org
GNU General Public License v3.0
95 stars 26 forks source link

Incorrect Excluded Atoms #342

Closed SofiaBariami closed 3 years ago

SofiaBariami commented 3 years ago

Hello, we are trying to convert some xml topology files to amber-type ones with this file. We noticed that the conversion doesn’t work properly for molecules as methane, ethane and methanol, as the list of the atoms that are excluded is wrong. What is confusing is that the excluded atoms are written in other molecules, eg neopentane Here us a list of molecules and whether excluded atoms are included . ethane No . methane No . neopentane Yes . 2-cyclopentanylindole Yes . 2-methylfuran Yes . 2-methylindole Yes . ethanol Yes . methanol No . toluene Yes . 7-cyclopentanylindole Yes

Our script detects bonds and angles correctly for all molecules and then calls CLJNBPairs() to set the scale factors for the pair of atoms that are 1-2 and 1-3. For example, if all atoms that are bonded are stored in a list called are12, the method defines the 1-2 exclusions as follows:

nbpairs = CLJNBPairs(editmol.info(), CLJScaleFactor(0,0))
for i in range(0, len(are12)):
    scale_factor1 = 0
    scale_factor2 = 0
    nbpairs.set(atoms.index( int(are12[i][0])), atoms.index(int(are12[i][1])), CLJScaleFactor(scale_factor1,scale_factor2))

For ethane for example we would have:

In [11]: molecule                                                                              
Out[11]: Molecule( 1 version 85 : nAtoms() = 8, nResidues() = 1 )

In [12]: molecule.property("intrascale")                                                       
Out[12]: CLJNBPairs( nAtoms() == 8, nGroups() == 1 )

In [13]: intr = molecule.property("intrascale")                                                

In [14]: intr.excludedAtoms()                                                                  
Out[14]: [AtomIdx(0), AtomIdx(1)]

In [15]: intr.nExcludedAtoms()                                                                 
Out[15]: 2

where the atoms with indices 0 and 1 are the two carbons:

HETATM    1  C00 MOL A   1      1.0000  1.0000  0.0000  0.00  0.00   0           
HETATM    2  C01 MOL A   1     -0.5191  1.0000  0.0000  0.00  0.00   0           

However, after calling the function AmberPrm() for the same system we see that the number of excluded atoms becomes zero and this results to wrong amber files when writing the new file with prm.writeToFile().

In [18]: prm = AmberPrm(system)  
In [19]: prm.nExcluded()                                                                       
Out[19]: 0

Amber file output:

%FLAG NUMBER_EXCLUDED_ATOMS
%FORMAT(10I8)
       1       1       1       1       1       1       1       1
%FLAG NONBONDED_PARM_INDEX

For this reason, we were wondering if there is a problem with the Sire parser.

lohedges commented 3 years ago

Hi @SofiaBariami . I'll take a look at the parser in more detail tomorrow, but I see the following when I do a round trip conversion with BioSimSpace. (Here I'm working from the demos directory.)

import BioSimSpace as BSS
from Sire.IO import AmberPrm

# Print the number of excluded atoms from the intrascale property of the original system.
s0 = BSS.IO.readMolecules(["molecules/ethane.prm7", "molecules/ethane.rst7"])
s0[0]._sire_object.property("intrascale").nExcludedAtoms()
2

# Create an AmberPrm object and check the number of excluded atoms.
amber_prm = AmberPrm(s0._sire_object)
amber_prm.nExcluded()
0

# Save to file.
amber_prm.writeToFile("tmp.prm7")

# Load back into a new system and check the new intrascale for the number of exclusions.
s1 = BSS.IO.readMolecules(["tmp.prm7", "molecules/ethane.rst7"])
s1[0]._sire_object.property("intrascale").nExcludedAtoms()
2

It seems that, although the number of excluded atoms reported by AmberPrm is incorrect, the information written to file is correct and can be reconstructed properly. Indeed, when comparing the NUMBER_EXCLUDED_ATOMS flags in the two topology files, we find that they are identical:

ethane.prm7:

%FLAG NUMBER_EXCLUDED_ATOMS
%FORMAT(10I8)
       7       6       5       4       3       2       1       1

tmp.prm7:

%FLAG NUMBER_EXCLUDED_ATOMS
%FORMAT(10I8)
       7       6       5       4       3       2       1       1
lohedges commented 3 years ago

If we just load the topology files directly using the AmberPrm parser then it reports the number of excluded atoms as 0 in both cases:

amber_prm0 = AmberPrm("molecules/molecules.prm7")
amber_prm0.nExcluded()
0

amber_prm1 = AmberPrm("tmp.prm7")
amber_prm1.nExcluded()
0

Looking at the pointers section of the files (which are identical) I see:

%FLAG POINTERS
%FORMAT(10I8)
       8       2       6       1      12       0       9       0       0       0
      29       1       1       0       0       2       2       1       2       0
       0       0       0       0       0       0       0       0       8       0
       0       0       0

From the AMBER docs we can see that the 11th entry in the pointers section records the number of excluded atoms.

%FLAG POINTERS
%FORMAT(10i8)  NATOM,    NTYPES, NBONH,  MBONA,  NTHETH, MTHETA,
               NPHIH,    MPHIA,  NHPARM, NPARM,  NNB,    NRES,
               NBONA,    NTHETA, NPHIA,  NUMBND, NUMANG, NPTRA,
               NATYP,    NPHB,   IFPERT, NBPER,  NGPER,  NDPER,
               MBPER,    MGPER,  MDPER,  IFBOX,  NMXRS,  IFCAP,
               NUMEXTRA, NCOPY
  NATOM    : total number of atoms 
  NTYPES   : total number of distinct atom types
  NBONH    : number of bonds containing hydrogen
  MBONA    : number of bonds not containing hydrogen
  NTHETH   : number of angles containing hydrogen
  MTHETA   : number of angles not containing hydrogen
  NPHIH    : number of dihedrals containing hydrogen
  MPHIA    : number of dihedrals not containing hydrogen
  NHPARM   : currently not used
  NPARM    : used to determine if addles created prmtop
  NNB      : number of excluded atoms
  NRES     : number of residues
  NBONA    : MBONA + number of constraint bonds
  NTHETA   : MTHETA + number of constraint angles
  NPHIA    : MPHIA + number of constraint dihedrals
  NUMBND   : number of unique bond types
  NUMANG   : number of unique angle types
  NPTRA    : number of unique dihedral types
  NATYP    : number of atom types in parameter file, see SOLTY below
  NPHB     : number of distinct 10-12 hydrogen bond pair types
  IFPERT   : set to 1 if perturbation info is to be read in
  NBPER    : number of bonds to be perturbed
  NGPER    : number of angles to be perturbed
  NDPER    : number of dihedrals to be perturbed
  MBPER    : number of bonds with atoms completely in perturbed group
  MGPER    : number of angles with atoms completely in perturbed group
  MDPER    : number of dihedrals with atoms completely in perturbed groups
  IFBOX    : set to 1 if standard periodic box, 2 when truncated octahedral
  NMXRS    : number of atoms in the largest residue
  IFCAP    : set to 1 if the CAP option from edit was specified
  NUMEXTRA : number of extra points found in topology
  NCOPY    : number of PIMD slices / number of beads

However, looking at the Sire code:

/** Return the number of excluded atoms */
int AmberPrm::nExcluded() const
{
    if (pointers.count() > 9)
        return pointers[9];
    else
        return 0;
}

It is returning the value of the 10th pointer, pointers[9] ,rather than the value of pointers[10]. (In this case pointers[9] is zero.)

I'll fix this in the code, as well as making sure that the pointer is being used correctly elsewhere in the code.

lohedges commented 3 years ago

It looks like it's just an indexing error when returning the number of excluded atoms, since the index is used correctly on assignment:

// save the number of excluded atoms to 'pointers'
pointers[10] = std::get<2>(excl_lines);     // NNB

(There is a similar indexing error for NRES.)

I think the parser is working correctly, since I wouldn't be able to do the round-trip and get excactly the same records otherwise. If your NUMBER_EXCLUDED_ATOMS flag doesn't look right, then I imagine that it's an issue with the way the intrascale matrix was constructed.

SofiaBariami commented 3 years ago

Thank you very much Lester. It makes sense. I'll have a look at the script to see what's wrong with the intrascale property.