openmm / openmmforcefields

CHARMM and AMBER forcefields for OpenMM (with small molecule support)
http://openmm.org
Other
222 stars 78 forks source link

No template found for residue 67 (NRQ). The set of atoms matches ..., but the bonds are different. #325

Open dishu1 opened 3 months ago

dishu1 commented 3 months ago

Hello,

I'm trying to parameterize a fluorescent ligand in a protein using Espaloma template generator but I'm struggling to understand how to fix such an issue. It does not look self-explanatory and the documentation is almost absent. What do I do?

import os
import json
from openff.toolkit import Molecule
from openmm.app import ForceField, Modeller, PDBFile, PME, HBonds, Simulation
from openmm import LangevinMiddleIntegrator
from openmm.unit import nanometer, kelvin, picosecond
from openmmforcefields.generators import EspalomaTemplateGenerator

pdb = PDBFile('3ned_pdbfixer.pdb')
project_root = os.path.dirname(os.path.abspath(''))
#Set cache path for forcefield parameters
cache_path = 'xxx'
molecules = []
molecule = Molecule.from_file('nrq_residue.mol')
molecule.name = 'NRQ'
molecule.generate_conformers(n_conformers=1)
molecules.append(molecule)
print(molecule)
pdb_file_path = 'xxx/nrq.pdb'
molecule.to_file(pdb_file_path, file_format='PDB')
espaloma = EspalomaTemplateGenerator(molecules=molecules, cache=cache_path, forcefield='espaloma-0.3.2')
print(espaloma)
# Register the template generator
forcefield = ForceField('amber/protein.ff14SB.xml', 'amber/tip3p_standard.xml')
forcefield.registerTemplateGenerator(espaloma.generator)
print('Parametrizing: ', pdb_file_path)
pdbfile = PDBFile(pdb_file_path)
modeller = Modeller(pdbfile.topology, pdbfile.positions)
box_vectors = modeller._computeBoxVectors(3.0, "cube")
modeller.topology.setPeriodicBoxVectors(box_vectors)
system = forcefield.createSystem(modeller.topology, nonbondedMethod=PME, nonbondedCutoff=1.0*nanometer, rigidWater=True, constraints=HBonds)
integrator = LangevinMiddleIntegrator(298*kelvin, 1/picosecond, 0.002*picosecond)
simulation = Simulation(pdbfile.topology, system, integrator)
simulation.context.setPositions(modeller.positions)
simulation.minimizeEnergy()
pdb = PDBFile('3ned_pdbfixer.pdb')
print(pdb.topology)
modeller = Modeller(pdb.topology, pdb.positions)
Modeller.loadHydrogenDefinitions('nrq_hydrogens.xml')
modeller.addHydrogens(forcefield, pH = 7.0)
system = forcefield.createSystem(modeller.topology)

<Topology; 5 chains, 12287 residues, 39768 atoms, 27755 bonds>
Did not recognize residue NRQ; did you forget to call .add_molecules() to add it?
No template found for residue 67 (NRQ).  The set of atoms matches [H]/[N]=[C](/[C]1=[N]/[C](=[C](/[H])[c]2[c]([H])[c]([H])[c]([O][H])[c]([H])[c]2[H])[C](=[O])[N]1[C]([H])([H])[C]([H])=[O])[C]([H])([H])[C]([H])([H])[S][C]([H])([H])[H], but the bonds are different.  For more information, see https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#template"
}

The output is truncated, of course.

Issue.zip

mattwthompson commented 3 months ago

It looks like it worked fine until you changed the hydrogens?

dishu1 commented 3 months ago

How so? OpenMM does not know how to add hydrogens to NRQ unless specified in xml. I need to create the topology and assign forcefield parameters for the entire system, not for just one molecule.

mattwthompson commented 3 months ago

The template-matching code has some requirements about the molecule's atoms and bond graph being the same when you call createSystem as when the template generator is constructed. Then, you modified the topology, which may have modified the ligand, causing the template to no longer match the molecule it was constructed with. That's what the error message indicates.

dishu1 commented 3 months ago

At what moment did I modify the topology?

mattwthompson commented 3 months ago

Actually, your error is different than it appeared at first glance. You're re-creating multiple objects from different sources and assigning them the same name which confused me. This didn't throw an error

modeller = Modeller(pdbfile.topology, pdbfile.positions)
....
system = forcefield.createSystem(modeller.topology, nonbondedMethod=PME, nonbondedCutoff=1.0*nanometer, rigidWater=True, constraints=HBonds)

which indicates that the molecule used to generate the template matches the one that's in the topology. But later on the same call does throw an error

modeller = Modeller(pdb.topology, pdb.positions)
...
system = forcefield.createSystem(modeller.topology)

which led me to believe you're modifying the topology. But it's a different topology altogether. Something differs between the mol file you provided earlier (succeeds in template matching) and the ligand you have in the PDB file you use later on. Maybe a peptide bond?

dishu1 commented 3 months ago

Perhaps it is a peptide bond. I had to draw the ligand manually in Avogadro or something and then parameterize it. Also, it is confusing how exactly the ligand needs to be cut out to be parameterized. How to do all this properly?