Closed casebeerVRTX closed 2 years ago
There are a number of things that could be going wrong here, including apparently good convergence that is actually poor.
With perses, we're not yet at the stage we reached with YANK, where we could automatically generate "simulation health reports" that make it easy to diagnose issues without disclosing chemical information about the transformation, so we'll need to ask you to find a public example for which you can share your input files. Can you upload a ZIP file and also provide a `conda env export` dump of your conda environment?
Our funding for this project is from other partners that we have to prioritize, but we try to give a "best effort" in helping other folks out when we can.
Thank you! We are working on putting something together.
Awesome---that will help immensely.
@mikemhenry @ijpulidos : We should add some A->B and B->A consistency tests (likely just ligand in solvent, not complex) to our GPU CI as well. We can start with examples we already have, but can add a public example from @casebeerVRTX once provided.
We appreciate the help, and we have been testing on some public data so it will be easier to debug. However, we've made a few modifications to `perses`.
To run `perses` on membrane proteins, we worked with @dominicrufa last summer. We prepared the protein + membrane system outside of `perses` and then used `modeller` to add waters and ions. After this, we saved a pickle of the topology and a `numpy` array of the positions.
The examples provided are from the following paper, a Nav 1.7 protein with Arylsulfonamide inhibitors.
import numpy as np
from simtk import unit
from simtk.openmm.app import ForceField, Modeller, PDBFile, PME

# This is receptor + membrane
pdb = PDBFile("tmp2.pdb")
modeller = Modeller(pdb.topology, pdb.positions)
forcefield = ForceField('amber/ff14SB.xml', 'amber/tip3p_standard.xml', 'amber/lipid17.xml')

# This works with the included POPC
system = forcefield.createSystem(pdb.getTopology(), nonbondedMethod=PME)

# Add waters and ions to the receptor + membrane system
modeller.addSolvent(forcefield, model='tip3p', numAdded=65000,
                    ionicStrength=0.150*unit.molar, positiveIon='Na+',
                    negativeIon='Cl-', neutralize=True)


def write_pickle(obj, pickle_filename):
    """
    write a pickle

    arguments
        obj : object
            picklable object
        pickle_filename : str
            name of pickle
    """
    import pickle
    with open(pickle_filename, 'wb') as f:
        pickle.dump(obj, f)


write_pickle(modeller.topology, "openmm.pickle")

# Strip the units; positions are stored in nanometers
_positions = modeller.positions / unit.nanometers
# Note: np.savez appends ".npz", so this writes "positions.z.npz"
np.savez("positions.z", positions=_positions)
We then modified `relative_setup` to omit solvation and added class attributes for `receptor_topology` and `receptor_positions` to the `RelativeFEPSetup` class. If those are defined, we depickle the files we prepared outside of `perses` and set `self._receptor_positions_old`, `self._receptor_topology_old`, and `self._receptor_md_topology_old`.
def depickle(pickle_filename):
    """
    load a pickle

    arguments
        pickle_filename : str
            name of pickle

    returns
        obj : loaded pickle object
    """
    import pickle
    with open(pickle_filename, 'rb') as f:
        # Use a separate name so the pickle module is not shadowed
        obj = pickle.load(f)
    return obj
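Before handing files to the modified setup, it can help to confirm that the two helpers round-trip cleanly. The sketch below is only a sanity check under stated assumptions: it redefines the helpers so it runs on its own, and `stand_in` is a hypothetical placeholder for the real OpenMM topology, which is not reproduced here.

```python
import os
import pickle
import tempfile


def write_pickle(obj, pickle_filename):
    """Write a picklable object to disk."""
    with open(pickle_filename, "wb") as f:
        pickle.dump(obj, f)


def depickle(pickle_filename):
    """Load a pickled object from disk."""
    with open(pickle_filename, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-in for the pickled topology (assumption, not real data).
stand_in = {"n_atoms": 3, "residues": ["POPC", "HOH", "NA"]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "openmm.pickle")
    write_pickle(stand_in, path)
    restored = depickle(path)

# The restored object should compare equal to what was written.
print(restored == stand_in)
```

With the real files, the same pattern would compare atom counts between the depickled topology and the saved positions array before starting a run.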
Inside of `setup_relative_calculation.py` we added an option to use the Monte Carlo barostat. We also added options to directly pass the receptor topology and receptor positions to `RelativeFEPSetup`.
(We have not merged in new commits to `perses`, so we are about 6 months out of date.)
In `run.yaml` we're loading the Lipid17 FF to handle POPC. Other than that, I think things are fairly standard.
Attached are the conda environment, receptor PDB, ligand sdf, pickle, numpy positions, yaml file, and run.py.
To make things a little more concrete and fix some copy/paste indentation issues, here is a diff between `perses` commit ac5cd56e138383b557ecfd5d0fc780bef193ec0d and our modified version.
Hello, we have continued to try to debug this issue and unfortunately have not had much luck. We have tried changing the number of REPEX cycles, the number of MD steps, and the number of lambda windows; adding a backbone restraint; plotting replica mixing; doing a visual trajectory analysis; and examining the DDG as a function of time.
I am still working with the protein and ligands specified in the sdf file that I sent earlier. My expectation is that any two reciprocal transformations would sum to 0. I’m looking at a particular transformation (9->10 and 10->9), one of many that does not sum to zero.
Here is a summary of a bunch of things I tried:
This is the 10 -> 9 transformation.
It appears as if none of these changes resulted in significant progress towards getting the two transformations to sum to zero. Below is a graph of all the DDG results along with the experimental values on the far right.
Looking more closely at simulation 9, which has 1000 REPEX cycles, 1000 MD steps, 4 fs timesteps, and 11 lambda windows, meaning 44ns of simulation time per transformation, I plotted the replica mixing and the DDG over time. Here is the replica mixing (complex on left and solvent on right) for 10 -> 9.
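The 44 ns figure can be reproduced from the quoted settings. This is a back-of-the-envelope sketch that assumes one replica per lambda window; the variable names are illustrative and not taken from the perses configuration.

```python
# Settings quoted for simulation 9.
n_cycles = 1000            # REPEX cycles
md_steps_per_cycle = 1000  # MD steps per cycle
timestep_fs = 4.0          # femtoseconds per MD step
n_lambda_windows = 11      # assumed: one replica per lambda window

# Simulation time accumulated by a single replica (fs -> ns).
ns_per_replica = n_cycles * md_steps_per_cycle * timestep_fs / 1e6

# Aggregate time summed over all replicas.
total_ns = ns_per_replica * n_lambda_windows

print(ns_per_replica, total_ns)  # 4.0 44.0
```

Under that assumption, the 44 ns is an aggregate over replicas, and each individual replica covers 4 ns.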
Here is the replica mixing (complex on left and solvent on right) for 9 -> 10.
Qualitatively, they seem to be mixing.
Here is a plot of the ddG over time of the 10 -> 9 and the 9 -> 10 transformation. Again, qualitatively, it seems to be close to equilibrium.
We also tried to run this protein without the lipids (just in solvent) using the non-lipid-modified perses. Even though we would not expect the predicted values to match the experimental values, because there are lipids near the binding site, we wanted to test whether the two reciprocal transformations would sum to 0. At the moment, we are getting the following error:
AssertionError: the difference between the atoms_with_positions_reduced_potential and the sum of atoms_with_positions_reduced_potential_components is 38.51413189283903
I realize that debugging the lipid-modified perses may pose some challenges, but I was hoping you might be able to suggest what else to check as we’re working on this issue.
I wanted to follow up on this issue since it is now resolved. We ended up rebasing the lipid version of perses onto the most recent version and changed the atom mapping strategy, because the same ring was getting mapped incorrectly. This doesn't solve the issue of both of these being positive, but we were able to find a workaround.
Thank you!
We are trying to troubleshoot some issues using perses on some ligand transformations. We're testing a single forward and backward transformation (A -> B and B -> A), which is the most trivial closed cycle. Our intuition tells us that the ddGs of these two transformations should be equal in magnitude and opposite in sign, and this is what we see with the experimental data. Right now, the calculated ddGs do not add up to zero; instead, both are strongly positive. Can you help shed some light on this? Could this be a units problem? This is what the mm_graph looks like:
OutEdgeDataView([(9, 10, {'calc_DDG': 5.18712727908637, 'calc_dDDG': 0.4801874381154462, 'exp_DDG': -0.9193053246474641, 'exp_dDDG': 0.0}), (10, 9, {'calc_DDG': 11.124917041483098, 'calc_dDDG': 0.4120190004511572, 'exp_DDG': 0.9193053246474641, 'exp_dDDG': 0.0})])
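The size of the inconsistency can be quantified directly from the two edges above. The following is a minimal sketch, not part of perses: `cycle_closure` is a hypothetical helper, the values are copied from the output above in whatever units the tool reported, and the combined uncertainty assumes the two estimates are statistically independent.

```python
import math

# Edge data copied from the mm_graph output above.
edges = {
    (9, 10): {"calc_DDG": 5.18712727908637, "calc_dDDG": 0.4801874381154462},
    (10, 9): {"calc_DDG": 11.124917041483098, "calc_dDDG": 0.4120190004511572},
}


def cycle_closure(forward, reverse):
    """Return (closure error, combined uncertainty) for a reciprocal pair.

    For a consistent A -> B / B -> A pair the closure error should be
    zero within the combined uncertainty.
    """
    closure = forward["calc_DDG"] + reverse["calc_DDG"]
    # Assumption: independent errors, so uncertainties add in quadrature.
    sigma = math.hypot(forward["calc_dDDG"], reverse["calc_dDDG"])
    return closure, sigma


closure, sigma = cycle_closure(edges[(9, 10)], edges[(10, 9)])
print(f"closure = {closure:.2f} +/- {sigma:.2f} "
      f"({abs(closure) / sigma:.1f} sigma from zero)")
```

For these two edges the sum is about 16.3 with a combined uncertainty of about 0.63, so the discrepancy is far larger than the reported statistical error.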