openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License
318 stars 92 forks source link

Conformer generation failed for some of the benchmarkset molecules #321

Closed proteneer closed 5 years ago

proteneer commented 5 years ago

For some of the ring-cage host guest systems, the conformer generation step (possibly from charge assignment) fails:

from simtk.openmm.app import PDBFile
from openforcefield.topology import Molecule, Topology
from openforcefield.typing.engines.smirnoff import ForceField

sdf_mol = Molecule('/home/yzhao/Code/benchmarksets/input_files/cb7-set1/sdf/guest-17.sdf')
force_field = ForceField('smirnoff99Frosst.offxml')

# Parametrize the ligand molecule by creating a Topology object from it
ligand_system = force_field.create_openmm_system(sdf_mol.to_topology())
Unable to load toolkit <openforcefield.utils.toolkits.RDKitToolkitWrapper object at 0x7f52a02690b8>.
Unable to load toolkit <openforcefield.utils.toolkits.AmberToolsToolkitWrapper object at 0x7f52a02690b8>.
Warning: OEMMFFParams::PrepMol() : unable to type atom 0 N 
Warning: OEMMFFParams::PrepMol() : unable to type atom 1 N 
Warning: OEMMFFParams::PrepMol() : unable to type atom 12 H 
Warning: OEMMFFParams::PrepMol() : unable to type atom 13 H 
Warning: OEMMFFParams::PrepMol() : unable to type atom 14 H 
Warning: OEMMFFParams::PrepMol() : unable to type atom 15 H 
Warning: OEMMFFParams::PrepMol() : unable to type atom 16 H 
Warning: OEMMFFParams::PrepMol() : unable to type atom 17 H 
Warning: force field setup failed due to missing parameters for molecule MOL
Traceback (most recent call last):
  File "system_setup.py", line 9, in <module>
    ligand_system = force_field.create_openmm_system(sdf_mol.to_topology())
  File "/home/yzhao/Code/timemachine/openforcefield/openforcefield/typing/engines/smirnoff/forcefield.py", line 863, in create_openmm_system
    parameter_handler.create_force(system, topology, **kwargs)
  File "/home/yzhao/Code/timemachine/openforcefield/openforcefield/typing/engines/smirnoff/parameters.py", line 2027, in create_force
    temp_mol.generate_conformers(num_conformers=10, toolkit_registry=toolkit_registry)
  File "/home/yzhao/Code/timemachine/openforcefield/openforcefield/topology/molecule.py", line 1947, in generate_conformers
    clear_existing=clear_existing)
  File "/home/yzhao/Code/timemachine/openforcefield/openforcefield/utils/toolkits.py", line 3027, in call
    return method(*args, **kwargs)
  File "/home/yzhao/Code/timemachine/openforcefield/openforcefield/utils/toolkits.py", line 1130, in generate_conformers
    raise Exception("OpenEye Omega conformer generation failed")
Exception: OpenEye Omega conformer generation failed
proteneer commented 5 years ago

Benchmark sets can be found here: https://github.com/MobleyLab/benchmarksets/tree/master/input_files

j-wags commented 5 years ago

@SimonBoothroyd is out today, but he had similar problems with conformer generation for ring systems in the property calculator.

Simon eventually resorted to using externally-generated charges for rings using the charge_from_molecules kwarg to create_openmm_system.

jchodera commented 5 years ago

This is an OpenEye toolkit MMFF type assignment issue. We should be able to easily construct a test case where you simply try to generate conformers with Omega and report that to their bug tracker.

proteneer commented 5 years ago

If I already have a starting geometry, why can't I just use that directly to do the AM1 calculations? This is a ring cage so there isn't that much conformational flexibility anyways.

jchodera commented 5 years ago

If I already have a starting geometry, why can't I just use that directly to do the AM1 calculations? This is a ring cage so there isn't that much conformational flexibility anyways.

We generate geometries automatically in an effort to make the force field less dependent on the specific choice of geometry you use. If we did not do this, you could get vastly different behavior simply due to which conformer you picked.

In particular, we use the canonical ELF10 AM1-BCC procedure, which is intended to prevent large geometry-dependent charge fluctuations.

You are supposed to be able to specify your own partial charges by providing a charged molecule as an optional argument to ForceField.create_openmm_system(), but I'm not seeing that argument implemented.

@j-wags : Did this user-specified charge assignment option accidentally fall off the roadmap somehow? This is a critical feature.

andrrizzi commented 5 years ago

I'm not seeing that argument implemented

I think it is implemented as Jeff mentioned, but it is absorbed in the kwargs that is forwarded to one of the handlers, and we probably forgot to include it explicitly in the docstring.

jchodera commented 5 years ago

I think it is implemented as Jeff mentioned, but it is absorbed in the kwargs that is forwarded to one of the handlers, and we probably forgot to include it explicitly in the docstring.

Great! Can you create an issue to fix the docs?

andrrizzi commented 5 years ago

Can you create an issue to fix the docs?

I've already fixed it in #322 .

guoweiqi commented 5 years ago

OEChem also has difficulties generating conformers for very basic charged molecules, like butylammonium (https://github.com/MobleyLab/benchmarksets/blob/master/input_files/cd-set1/sdf/guest-1.sdf)

guoweiqi commented 5 years ago

Looks like using mol2 instead of sdf files works

j-wags commented 5 years ago

I'm finally getting back to this. The SDF in question can be loaded and processed, but only if the atomic formal charges are updated. @guoweiqi already figured this out, and opened an Issue + PR at https://github.com/MobleyLab/benchmarksets/issues/68 and https://github.com/MobleyLab/benchmarksets/pull/69.

I think it's reasonable to say that this issue is closed, and the root cause is that the underlying data may be incorrect. Discussion on that should continue on the relevant PRs in benchmarksets