mosdef-hub / foyer

A package for atom-typing as well as applying and disseminating forcefields
https://foyer.mosdef.org
MIT License
117 stars 76 forks source link

ParmEd residue matching assumes ordered residues #504

Open CalCraven opened 2 years ago

CalCraven commented 2 years ago

A summary of the question or discussion topic. So the use_residue_matching flag in the foyer.Forcefield.apply method has some inherent assumptions that are not indicated in the doc string. If you happen to have created a system with replicated molecules (noted as residues in ParmEd), then only a single one of each molecules with the same name are atomtyped. However, the unwrap_typemap function in foyer/forcefield.py uses the indices of the atoms in residue.atoms to grab the types from the typemap and copy them over to all of the other residue atoms. This inherently assumes the order of atoms is the same in every molecule.

code to reproduce issue

class CHHHOH(mb.Compound):
    def __init__(self):
        super(CHHHOH, self).__init__()
        ch3 = mb.lib.moieties.ch3.CH3()
        self.add(ch3, label="CH3")
        port = mb.Port(anchor=self[0])
        self.add(port, label="up")

        h20 = mb.lib.moieties.h2o.H2O()
        h20.remove(h20[-1])
        bond_port = h20.all_ports()[0]
        mb.force_overlap(h20, bond_port, self["up"])
        self.add(h20, label="OH")
import mbuild as mb
cpd_COHHHH = mb.load("CO", smiles=True)
cpd_COHHHH.name = "propanol"

cpd_CHHHOH = CHHHOH()
cpd_CHHHOH.name="propanol"

for parts in list(zip(cpd_COHHHH.particles(), cpd_CHHHOH.particles())):
    print(parts[0].name, parts[1].name) #see the difference in atom order for these two compounds
cpd_box = mb.packing.fill_box([cpd_COHHHH, cpd_CHHHOH], [1,1], density=1)

import foyer
opls = foyer.Forcefield(name="oplsaa")
pmd_cpd = opls.apply(cpd_box, assert_improper_params=False, use_residue_map=True, infer_residues=True,
                    verbose=True, assert_bond_params=False, assert_angle_params=False, 
                     assert_dihedral_params=False) #need to pass all of these assert_params because parmed throws a fit here
for atom in pmd_cpd.atoms:
    print(atom.name, atom.type) #you can see here that the typing was done wrong.

Output

C opls_157
O opls_154
H opls_156
H opls_156
H opls_156
H opls_155
C opls_157
H opls_154
H opls_156
H opls_156
O opls_156
H opls_155

We should note this in the doc string for forcefield.apply so people are aware of this as a possible issue. We also need to raise this infer_residues flag to the top level of apply, because use_residue_map does nothing for applying to an mBuild unless the infer_residues is also attached.