choderalab / perses

Experiments with expanded ensembles to explore chemical space
http://perses.readthedocs.io
MIT License
178 stars 50 forks source link

Mapping/parameterization issue #1161

Open viktorbelay opened 1 year ago

viktorbelay commented 1 year ago

I am trying to use the Perses CLI to calculate free energies for nucleotide binding to the IP3 receptor and am using ATP <-> AMP as a test run. I've been able to successfully run Perses for the Tyk2 example (https://github.com/choderalab/perses/tree/main/examples/protein-ligand/cli), but run into an error when trying it for my system:

2023-02-06 15:54:36,881:(22.83s):openff.toolkit.typing.engines.smirnoff.parameters:Finding matches for ToolkitAM1BCCHandler
2023-02-06 15:54:36,881:(0.00s):openff.toolkit.typing.engines.smirnoff.parameters:0 matches identified
2023-02-06 15:54:40,299:(3.42s):proposal_generator:determining atom map between old and new molecules...
2023-02-06 15:54:40,299:(0.00s):proposal_generator:the atom map is not specified; proceeding to generate an atom map...
2023-02-06 15:54:40,324:(0.02s):proposal_generator:Using MCSS to derive mapping...
Traceback (most recent call last):
  File "/home/belayv/miniconda3/envs/perses-viktor/bin/perses-cli", line 10, in <module>
    sys.exit(cli())
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/perses/app/cli.py", line 108, in cli
    run(yaml_filename=yaml, override_string=override)
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/perses/app/setup_relative_calculation.py", line 812, in run
    setup_dict = run_setup(setup_options)
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/perses/app/setup_relative_calculation.py", line 515, in run_setup
    fe_setup = RelativeFEPSetup(ligand_file, old_ligand_index, new_ligand_index, forcefield_files,phases=phases,
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/perses/app/relative_setup.py", line 387, in __init__
    self._complex_topology_proposal = self._proposal_engine.propose(self._complex_system_old_solvated,
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/perses/rjmc/topology_proposal.py", line 2122, in propose
    atom_mapping = atom_mapper.get_best_mapping(self.current_molecule, self.proposed_molecule)
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/perses/rjmc/atom_mapping.py", line 993, in get_best_mapping
    atom_mappings = self.get_all_mappings(old_mol, new_mol)
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/perses/rjmc/atom_mapping.py", line 796, in get_all_mappings
    old_offmol, old_oemol = copy_molecule(old_mol)
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/perses/rjmc/atom_mapping.py", line 789, in copy_molecule
    offmol = Molecule(mol)
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/openff/toolkit/topology/molecule.py", line 5836, in __init__
    super(Molecule, self).__init__(*args, **kwargs)
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/openff/toolkit/topology/molecule.py", line 2089, in __init__
    result = toolkit_registry.call(
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/openff/toolkit/utils/toolkit_registry.py", line 366, in call
    raise e
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/openff/toolkit/utils/toolkit_registry.py", line 362, in call
    return method(*args, **kwargs)
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/openff/toolkit/utils/openeye_wrapper.py", line 238, in from_object
    return self.from_openeye(
  File "/home/belayv/miniconda3/envs/perses-viktor/lib/python3.10/site-packages/openff/toolkit/utils/openeye_wrapper.py", line 950, in from_openeye
    raise UndefinedStereochemistryError(msg)
openff.toolkit.utils.exceptions.UndefinedStereochemistryError: Unable to make OFFMol from OEMol: OEMol has unspecified stereochemistry. oemol.GetTitle(): MOL
Problematic atoms are:
Atom atomic num: 15, name: P1, idx: 0, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 8, name: O2, idx: 18, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 8, name: O3, idx: 19, aromatic: False, chiral: False
bond order: 2, chiral: False to atom atomic num: 8, name: O4, idx: 20, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 8, name: O5, idx: 21, aromatic: False, chiral: False
Atom atomic num: 15, name: P2, idx: 1, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 8, name: O5, idx: 21, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 8, name: O6, idx: 22, aromatic: False, chiral: False
bond order: 2, chiral: False to atom atomic num: 8, name: O7, idx: 23, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 8, name: O8, idx: 24, aromatic: False, chiral: False
Atom atomic num: 15, name: P3, idx: 2, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 8, name: O8, idx: 24, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 8, name: O9, idx: 25, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 8, name: O10, idx: 26, aromatic: False, chiral: False
bond order: 2, chiral: False to atom atomic num: 8, name: O11, idx: 27, aromatic: False, chiral: False
Atom atomic num: 6, name: C6, idx: 12, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 7, name: N3, idx: 9, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 6, name: C7, idx: 13, aromatic: False, chiral: True
bond order: 1, chiral: False to atom atomic num: 8, name: O1, idx: 16, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 1, name: H3, idx: 33, aromatic: False, chiral: False
Atom atomic num: 6, name: C7, idx: 13, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 6, name: C6, idx: 12, aromatic: False, chiral: True
bond order: 1, chiral: False to atom atomic num: 6, name: C8, idx: 14, aromatic: False, chiral: True
bond order: 1, chiral: False to atom atomic num: 8, name: O13, idx: 29, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 1, name: H4, idx: 34, aromatic: False, chiral: False
Atom atomic num: 6, name: C8, idx: 14, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 6, name: C7, idx: 13, aromatic: False, chiral: True
bond order: 1, chiral: False to atom atomic num: 6, name: C9, idx: 15, aromatic: False, chiral: True
bond order: 1, chiral: False to atom atomic num: 8, name: O12, idx: 28, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 1, name: H5, idx: 35, aromatic: False, chiral: False
Atom atomic num: 6, name: C9, idx: 15, aromatic: False, chiral: True with bonds:
bond order: 1, chiral: False to atom atomic num: 6, name: C8, idx: 14, aromatic: False, chiral: True
bond order: 1, chiral: False to atom atomic num: 8, name: O1, idx: 16, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 6, name: C10, idx: 17, aromatic: False, chiral: False
bond order: 1, chiral: False to atom atomic num: 1, name: H6, idx: 36, aromatic: False, chiral: False

When I use AtomMapper directly on the SDF I'm using, I get similar warnings. However, it seems to be able to actually produce the mappings:

image image

I am wondering then if this could be a problem with parameterization. I have attached the full run log, my input and output files, and the Conda environment I am using. The command I'm running is:

LOGLEVEL=DEBUG perses-cli --yaml protein-ligand.yaml --platform CUDA

Attached files: inputs.zip outputs.zip log.zip environment.yml.zip

mikemhenry commented 1 year ago

Thanks for the very detailed bug report @viktorbelay It looks like the code path for using the mapper directly is slightly different from the path we take when using the CLI. I will investigate this.

mikemhenry commented 1 year ago

@viktorbelay Thank you again for the report!

When I use AtomMapper directly on the SDF I'm using, I get similar warnings. However, it seems to be able to actually produce the mappings:

Can you post the code you used to do that? I can reproduce your error, but I can't reproduce your success.

from perses.rjmc.atom_mapping import AtomMapper
from openeye import oechem

atom_mapper = AtomMapper()

ifs = oechem.oemolistream()
ifs.open("ligand_aligned.sdf")
list_of_oemols = []

for mol in ifs.GetOEGraphMols():
    list_of_oemols.append(oechem.OEGraphMol(mol))

current_molecule, proposed_molecule = list_of_oemols[0], list_of_oemols[1]

# Create an atom mapping for the default criteria
atom_mapping = atom_mapper.get_best_mapping(current_molecule, proposed_molecule)

Gives me the same error as running the whole pipeline LOGLEVEL=DEBUG perses-cli --yaml protein-ligand.yaml --platform CUDA

viktorbelay commented 1 year ago

@mikemhenry Thank you for helping with this!

Strange, here is my code:

from openff.toolkit.topology import Molecule
from pkg_resources import resource_filename
from perses.rjmc.atom_mapping import AtomMapper

# Define path to SDF and load molecules
sdf_file='/data/chodera/viktor/IP3R_MD/perses_test2/ligand_aligned.sdf'
mol=Molecule.from_file(sdf_file, allow_undefined_stereo=True)

# Display molecules to verify they are loaded correctly
display(mol[0].visualize(backend='rdkit')) 
display(mol[1].visualize(backend='rdkit'))

# Create a default AtomMapper
atom_mapper = AtomMapper()

# Do mapping from AMP to ATP
atom_mapping = atom_mapper.get_best_mapping(mol[1], mol[0])
display(atom_mapping)

# Do mapping from ATP to AMP
atom_mapping = atom_mapper.get_best_mapping(mol[1], mol[0])
display(atom_mapping)

I used the same environment to run this as the one I attached in my original post. I guess the difference is that you're using oechem to load the molecules whereas I used OpenFF. Could the issue be in oechem?

I'm also attaching a Jupyter notebook where I displayed the mappings: atom_mapping copy.ipynb.zip

mikemhenry commented 1 year ago

@viktorbelay I can reproduce your success as well now. What is weird is when I change this line mol=Molecule.from_file(sdf_file, allow_undefined_stereo=True) to mol=Molecule.from_file(sdf_file, allow_undefined_stereo=False) it still "works" but throws warnings about the stero chemistry being undefined.

I'm going to try and figure out why we are getting different results.

mikemhenry commented 1 year ago

The issue boils down to this function: https://github.com/choderalab/perses/blob/main/perses/utils/openeye.py#L323 Where setting allow_undefined_stereo=True actually causes us to not perceive the stereochemistry from the SDF, which is very different behavior than allowing undefined stereochemistry

What we want to do is have this code block always run but if allow_undefined_stereo=True, we don't raise an error on these functions but instead continue. Currently this block doesn't run at all which is an issue since when reading in a SDF file, we always want to try oechem.OE3DToInternalStereo

# perceive chirality
if not allow_undefined_stereo:
    assert oechem.OE3DToInternalStereo(molecule), f"the stereochemistry perception from 3D coordinates failed"
    assert not has_undefined_stereocenters(molecule), f"there is an atom with an undefined stereochemistry"
ijpulidos commented 1 year ago

@mikemhenry We discussed this yesterday on the devs meeting and I think I finally understand what you are trying to say, I think you are correct in suggesting that we should be executing the OE3DToInternalStereo from openeye regardless if the stereochemistry is defined or not.

There are some subtleties that will probably deserve some more discussion, because as far as the discussion went, there are molecules that could have undefined stereochemistry even if the bond orders and 3D coordinates are known. We would need to think how to deal with these, or rather, how the tools we are using deal with these cases.

mikemhenry commented 1 year ago

@ijpulidos I think during our refactor, we should try and see if we can use openff-toolkit as the entry point for loading in small molecules, then call to_rdkit or to_openeye as needed, since then we can leverage all the work they have put in with stereochemistry validation and perception without us having to re-invent the wheel.

ijpulidos commented 1 year ago

@ijpulidos I think during our refactor, we should try and see if we can use openff-toolkit as the entry point for loading in small molecules, then call to_rdkit or to_openeye as needed, since then we can leverage all the work they have put in with stereochemistry validation and perception without us having to re-invent the wheel.

Yeah, I still think we would like to have a workaround or semi-fix for this release, I'm just still not sure what's the best approach but maybe we just want to call OE3DToInternalStereo when we read from SDF, avoiding changing more things elsewhere that may have unexpected consequences. For example, check the changes in https://github.com/choderalab/perses/commit/bdbc8b3fccc6f7a0e49c2641abbab1d0e4ff8f8f (this is a branch I was experimenting with for these changes, they currently work for @viktorbelay simulation).

Does that sound okay? We will just have to check that the unit tests and benchmarks are fine with that.

mikemhenry commented 1 year ago

@ijpulidos Sure, that looks cleaner than what I did here https://github.com/choderalab/perses/pull/1167/files