openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License
313 stars 92 forks source link

Small molecule forcefield generation method failed with RDkit incompatibility #1257

Closed OkKakao closed 2 years ago

OkKakao commented 2 years ago

I'm trying to simulate small molecule with openmm. Therefore, forcefield generation for those molecules is necessary. I found that it is possible with GAFF and SMINOFF, however, it was not working since RDkit was not compatible with am1bcc method.

I'm currently working on colab, and installed openmm with following command

import sys
print(sys.version)
!wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.10.3-Linux-x86_64.sh
!bash Miniconda3-py*.sh -bfp /usr/local
!conda config --set always_yes yes
!conda config --add channels conda-forge
!conda create -n openmm python=3.7 cudatoolkit=10.0 git jupyterlab numpy pandas scipy matplotlib ipympl rdkit openbabel openmm mdtraj pymbar pdbfixer parmed openff-toolkit openmoltools openmmforcefields openmmtools
sys.path.append('/usr/local/envs/openmm/lib/python3.7/site-packages')
import openmm.testInstallation
openmm.testInstallation.main()

Also, my code is the following.

from openmm.app import *
from openmm import *
from openmm.unit import *
from sys import stdout
from openff.toolkit.topology import Molecule
from openmmforcefields.generators import SMIRNOFFTemplateGenerator
# Create an OpenFF Molecule object for benzene from SMILES
molecule = Molecule.from_smiles('c1ccccc1')
# Create the SMIRNOFF template generator with the default installed force field (openff-1.0.0)
smirnoff = SMIRNOFFTemplateGenerator(molecules=molecule)
# Create an OpenMM ForceField object with AMBER ff14SB and TIP3P with compatible ions
forcefield = ForceField('amber14/protein.ff14SB.xml', 'amber14/tip3p.xml')
# Register the SMIRNOFF template generator
forcefield.registerTemplateGenerator(smirnoff.generator)
pdb = PDBFile('BNZ4.pdb')
system = forcefield.createSystem(pdb.topology, nonbondedMethod=NoCutoff, nonbondedCutoff=1*nanometer, constraints=HBonds)
integrator = LangevinMiddleIntegrator(300*kelvin, 1/picosecond, 0.004*picoseconds)
simulation = Simulation(pdb.topology, system, integrator)
simulation.context.setPositions(pdb.positions)
simulation.minimizeEnergy()
simulation.reporters.append(PDBReporter('output.pdb', 1000))
simulation.reporters.append(StateDataReporter(stdout, 1000, step=True,
        potentialEnergy=True, temperature=True))
print(pdb.positions)
simulation.step(10000)

If i execute it, the following error pops up

/usr/local/envs/openmm/lib/python3.7/site-packages/openff/toolkit/typing/engines/smirnoff/parameters.py:4125: Warning: No registered toolkits can provide the capability "assign_partial_charges" for args "()" and kwargs "{'molecule': Molecule with name '' and SMILES '[H][c]1[c]([H])[c]([H])[c]([H])[c]([H])[c]1[H]', 'partial_charge_method': 'am1bcc', 'use_conformers': None, 'strict_n_conformers': False, 'normalize_partial_charges': True, '_cls': <class 'openff.toolkit.topology.molecule.FrozenMolecule'>}"
Available toolkits are: [ToolkitWrapper around The RDKit version 2020.03.3, ToolkitWrapper around Built-in Toolkit version None]
 ToolkitWrapper around The RDKit version 2020.03.3 <class 'openff.toolkit.utils.exceptions.ChargeMethodUnavailableError'> : partial_charge_method 'am1bcc' is not available from RDKitToolkitWrapper. Available charge methods are ['mmff94'] 
 ToolkitWrapper around Built-in Toolkit version None <class 'openff.toolkit.utils.exceptions.ChargeMethodUnavailableError'> : Partial charge method "am1bcc"" is not supported by the Built-in toolkit. Available charge methods are ['zeros', 'formal_charge']

  warnings.warn(str(e), Warning)
---------------------------------------------------------------------------
UnassignedMoleculeChargeException         Traceback (most recent call last)
[<ipython-input-6-da0952af17a1>](https://localhost:8080/#) in <module>()
     14 forcefield.registerTemplateGenerator(smirnoff.generator)
     15 pdb = PDBFile('BNZ4.pdb')
---> 16 system = forcefield.createSystem(pdb.topology, nonbondedMethod=NoCutoff, nonbondedCutoff=1*nanometer, constraints=HBonds)
     17 integrator = LangevinMiddleIntegrator(300*kelvin, 1/picosecond, 0.004*picoseconds)
     18 simulation = Simulation(pdb.topology, system, integrator)

5 frames
[/usr/local/envs/openmm/lib/python3.7/site-packages/openff/toolkit/typing/engines/smirnoff/parameters.py](https://localhost:8080/#) in postprocess_system(self, system, topology, **kwargs)
   3901             for ref_mol in uncharged_mols:
   3902                 msg += f"{ref_mol.to_smiles()}\n"
-> 3903             raise UnassignedMoleculeChargeException(msg)
   3904 
   3905         # Unless check is disabled, ensure that the sum of partial charges on a molecule

UnassignedMoleculeChargeException: The following molecules did not have charges assigned by any ParameterHandler in the ForceField:
[H][c]1[c]([H])[c]([H])[c]([H])[c]([H])[c]1[H]

It seems like RDkit is not compatible with am1bcc charge generation method. I requested the license of openeye module, however, it takes several weeks to verify my qualification. How can I resolve this problem? It took me a couple of weeks but still not working... BNZ4.zip

j-wags commented 2 years ago

Hi @OkKakao,

Thanks for the detailed writeup!

The AmberTools package can provide AM1BCC charges without an OpenEye license - The problem with your code above is that the way that the packages are installed in your colab code doesn't make AmberTools accessible (if I look at the conda output, ambertools is installed, it just isn't found at runtime)

I'm not a total expert in colab, but one thing I keep hearing is to "use condacolab" to help get complex conda environments set up on colab. So, updating your example for condacolab gives me cells that look like this:

!pip install -q condacolab
import condacolab
condacolab.install()

(there's a bunch of scary looking output here, and colab says "the runtime crashed" or something, but that's condacolab working as intended)

import sys
print(sys.version)
#!wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.10.3-Linux-x86_64.sh
#!bash Miniconda3-py*.sh -bfp /usr/local
!conda config --set always_yes yes
!conda config --add channels conda-forge
!mamba install git jupyterlab numpy pandas scipy matplotlib ipympl rdkit openbabel openmm mdtraj pymbar pdbfixer parmed openff-toolkit openmoltools openmmforcefields openmmtools
#sys.path.append('/usr/local/envs/openmm/lib/python3.7/site-packages')
import openmm.testInstallation
openmm.testInstallation.main()
!wget https://github.com/openforcefield/openff-toolkit/files/8480803/BNZ4.zip
!unzip BNZ4.zip
from openmm.app import *
from openmm import *
from openmm.unit import *
from sys import stdout
from openff.toolkit.topology import Molecule
from openmmforcefields.generators import SMIRNOFFTemplateGenerator
# Create an OpenFF Molecule object for benzene from SMILES
molecule = Molecule.from_smiles('c1ccccc1')
# Create the SMIRNOFF template generator with the default installed force field (openff-1.0.0)
smirnoff = SMIRNOFFTemplateGenerator(molecules=molecule)
# Create an OpenMM ForceField object with AMBER ff14SB and TIP3P with compatible ions
forcefield = ForceField('amber14/protein.ff14SB.xml', 'amber14/tip3p.xml')
# Register the SMIRNOFF template generator
forcefield.registerTemplateGenerator(smirnoff.generator)
pdb = PDBFile('BNZ4.pdb')
system = forcefield.createSystem(pdb.topology, nonbondedMethod=NoCutoff, nonbondedCutoff=1*nanometer, constraints=HBonds)
integrator = LangevinMiddleIntegrator(300*kelvin, 1/picosecond, 0.004*picoseconds)
simulation = Simulation(pdb.topology, system, integrator)
simulation.context.setPositions(pdb.positions)
simulation.minimizeEnergy()
simulation.reporters.append(PDBReporter('output.pdb', 1000))
simulation.reporters.append(StateDataReporter(stdout, 1000, step=True,
        potentialEnergy=True, temperature=True))
print(pdb.positions)
simulation.step(10000)

And then everything seems to run just fine :-)

Could you try this out and let me know if it resolves your issue?

OkKakao commented 2 years ago

It worked! I was really frustrated because of this problem. I truly appreciate for helping me. Thanks a lot!