michellab / BioSimSpace

Code and resources for the EPSRC BioSimSpace project.
https://biosimspace.org
GNU General Public License v3.0
77 stars 19 forks source link

Absolute Binding Free Energy Calculations with GROMACS #414

Closed fjclark closed 1 year ago

fjclark commented 1 year ago

Hello,

I'm attempting to run an absolute binding free energy calculation using GROMACS using the functionality in the Exscientia sandpit. However, the mdp files generated contained the line couple-moltype =, which should have been couple-moltype = LIG. mol._sire_object.name().value() was returning an empty string here.

I manually changed all the mdp files to contain couple-moltype = LIG, but the decoupling lambda windows for all simulations still produced no free energy changes (as checked by running BAR and checking the gradients with respect to coul-lambdas and vdw-lambdas). This was not true for the bonded-lambda windows, where the Boresch restraints were introduced - there was a free energy change / gradients with respect to bonded-lambdas was non-zero, as expected.

I set up the calculations with:

import BioSimSpace.Sandpit.Exscientia as BSS
import pandas as pd

# Import required units
from BioSimSpace.Sandpit.Exscientia.Units.Length import angstrom
from BioSimSpace.Sandpit.Exscientia.Units.Angle import radian, degree
from BioSimSpace.Sandpit.Exscientia.Units.Energy import kcal_per_mol
from BioSimSpace.Sandpit.Exscientia.Units.Temperature import kelvin

# Load the system and mark the ligand to be decoupled
system = BSS.IO.readMolecules(["input/SYSTEM.top", "input/SYSTEM.crd"])
lig = BSS.Align.decouple(system[0])
print(lig)
system.updateMolecule(0,lig)
print(system.nDecoupledMolecules())

# The original SOMD restraints:
#boresch restraints dictionary = {"anchor_points":{"r1":4946, "r2":4944, "r3":4949, "l1":11, "l2":2, "l3":3},
#"equilibrium_values":{"r0":5.92, "thetaA0":1.85, "thetaB0":1.59,"phiA0":-0.30, "phiB0":-1.55, "phiC0":2.90},
#"force_constants":{"kr":25.49, "kthetaA":66.74, "kthetaB":38.39, "kphiA":215.36, "kphiB":49.23, "kphiC":49.79}}

# Assign three atoms from the protein
r1 = system.getAtom(4946)
r2 = system.getAtom(4944)
r3 = system.getAtom(4949)

# Assign three atoms from the ligand
l1 = system.getAtom(11)
l2 = system.getAtom(2)
l3 = system.getAtom(3)

restraint_dict = {
        "anchor_points":{"r1":r1, "r2":r2, "r3":r3,
                         "l1":l1, "l2":l2, "l3":l3},
        "equilibrium_values":{"r0": 5.92 * angstrom,
                              "thetaA0": 1.85 * radian,
                              "thetaB0": 1.59 * radian,
                              "phiA0": -0.30 * radian,
                              "phiB0": -1.55 * radian,
                              "phiC0": 2.90 * radian},
        "force_constants":{"kr": 25.49 * kcal_per_mol / angstrom ** 2,
                           "kthetaA": 66.74 * kcal_per_mol / (radian * radian),
                           "kthetaB": 38.39 * kcal_per_mol / (radian * radian),
                           "kphiA": 215.36 * kcal_per_mol / (radian * radian),
                           "kphiB": 49.23 * kcal_per_mol / (radian * radian),
                           "kphiC": 49.79 * kcal_per_mol / (radian * radian)}}

restraint = BSS.FreeEnergy.Restraint(system, restraint_dict, 298*kelvin)

# Set the lambda schedule. Values for turning on restraints taken from previous work with MIF,
# values for coul and vdw taken from https://www.nature.com/articles/s42004-022-00721-4
gromacs_lam_vals=pd.DataFrame(data={
                "bonded":[0.000, 0.125, 0.250, 0.375, 0.500, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000],
                "coul":  [0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.100, 0.200, 0.300, 0.400, 0.500, 0.600, 0.700, 0.800, 0.900, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000, 1.000],
                "vdw":   [0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.050, 0.100, 0.150, 0.200, 0.250, 0.300, 0.350, 0.400, 0.450, 0.500, 0.550, 0.600, 0.650, 0.700, 0.750, 0.800, 0.850, 0.900, 0.950, 1.000],
                })

# The initial value of lambda
initial_lam=pd.Series(data={'bonded': 0.000, 'coul': 0.000, 'vdw': 0.000})

gromacs_protocol = BSS.Protocol.FreeEnergy(runtime=6*BSS.Units.Time.nanosecond, 
                                          lam=initial_lam,
                                          lam_vals=gromacs_lam_vals, 
                                          perturbation_type="full",
                                          temperature=298*kelvin)

gromacs_fe_calc = BSS.FreeEnergy.Relative(restraint.system, gromacs_protocol, 
                                         engine='gromacs', restraint=restraint, 
                                         work_dir='output',
                                         gpu_support = True, show_errors= True)

The required input, and some example output directories are here.

I'm using BSS version 2023.0.0+174.g8b7d476d with Sire 2023.0.2 on Linux, installed through Conda, with GROMACS 2022.

This is my first time using GROMACS, so there may well be something trivial wrong with my set-up code.

Thanks.

lohedges commented 1 year ago

Hi there.

Have you tried running this using the Exscientia fork? There have been a few one-way synchronisations (from us to them) recently, which have passed all of their internal checks, but there haven't been any synchronisations the other way. I've not written (or tested) any of this part of the codebase, so I'm not the best placed to answer.

However, the mdp files generated contained the line couple-moltype = , which should have been couple-moltype = LIG. mol._sire_object.name().value() was returning an empty string here.

This part of the code is identical in their fork, so presumably the same issue would occur. (Unless the name is being returning an empty string due to some Sire API changes?)

Tagging @xiki-tempula. Feel free to move this issue over to your fork if needed.

fjclark commented 1 year ago

Thanks for the reply. I've just tried with the Exscientia fork and I still get the empty couple-moltype issue.

xiki-tempula commented 1 year ago

If you save the lig = BSS.Align.decouple(system[0]) into a gromacs topology, does it has a name?

fjclark commented 1 year ago

Yes, its name is LIG, as with the input generated above. The end of the topology file shows

[ molecules ]
;molecule name    nr.
           LIG      1

Thanks.

xiki-tempula commented 1 year ago

@fjclark Can you load this topology file back in. Replacing the line system = BSS.IO.readMolecules(["input/SYSTEM.top", "input/SYSTEM.crd"]) and do everything again to see if it works?

fjclark commented 1 year ago

@xiki-tempula that solves the issue. Now couple-moltype = LIG, and the gradients with respect to coul-lambdas and vdw-lambdas are non-zero. Thanks very much!

lohedges commented 1 year ago

Thanks for solving. I'd suggest adding some fallback in case the input was generated via another means, since not all of the parsers apply a molecule name in this way, e.g. I'm not sure if it would always be persistent on save/load depending on the format.