GilsonLabUCSD / pAPRika

Advanced toolkit for binding free energy calculations
BSD 3-Clause "New" or "Revised" License
31 stars 14 forks source link

Tleap not work properly #165

Closed rehan1988 closed 1 year ago

rehan1988 commented 3 years ago

If I ran the tleap individually with in paprika environment its works fine, but in paprika.py script it cause problem. In implicit solvent I am able to run pakrika but in it I ran the tleap seprate to generate models. In explicit solvent i have to solvate the windows and Now I failing. Please guide me why tleap not work fine in this script.

error is below

Traceback (most recent call last): File "/home/cmadrid/umbrella/EC/pAPRika2.py", line 268, in system.build() File "/home/cmadrid/anaconda3/envs/paprika/lib/python3.9/site-packages/paprika/build/system/tleap.py", line 432, in build self.solvate() File "/home/cmadrid/anaconda3/envs/paprika/lib/python3.9/site-packages/paprika/build/system/tleap.py", line 702, in solvate waters = self.count_waters() File "/home/cmadrid/anaconda3/envs/paprika/lib/python3.9/site-packages/paprika/build/system/tleap.py", line 792, in count_waters waters = self.count_residues()["WAT"] File "/home/cmadrid/anaconda3/envs/paprika/lib/python3.9/site-packages/paprika/build/system/tleap.py", line 830, in count_residues raise Exception( Exception: tleap was unable to successfully create the system after 10 attempts. Investigate the leap.log for errors.

script

Create AMBER coordinate

from paprika.build.system import TLeap

system = TLeap()

system.out_path = "complex"

system.pbc_type = None

system.neutralize = False

system.template_line = [

"source leaprc.protein.ff19SB"

"source leaprc.gaff",

"loadamberparams lig.am1bcc.frcmod",

"lig = loadmol2 ligand.mol2",

"model = loadpdb complex.pdb",

"check model",

"savepdb model gas.complex.pdb",

"saveamberparm model gas.complex.prmtop gas.complex.rst7"

]

system.build()

Align the pulling axis

import parmed as pmd structure = pmd.load_file("complex/gas.complex.prmtop", "complex/gas.complex.rst7", structure=True) from paprika.build import align

from paprika import align

G1 = ":4CT@C9" G2 = ":4CT@C5'" aligned_structure = align.zalign(structure, G1, G2,) aligned_structure.save("complex/aligned.prmtop", overwrite=True) aligned_structure.save("complex/aligned.rst7", overwrite=True)

Add dummy atoms

from paprika.build import dummy structure = pmd.load_file("complex/aligned.prmtop", "complex/aligned.rst7", structure=True) structure = dummy.add_dummy(structure, residue_name="DM1", z=-6.0) structure = dummy.add_dummy(structure, residue_name="DM2", z=-9.0) structure = dummy.add_dummy(structure, residue_name="DM3", z=-11.2, y=2.2) structure.save("complex/aligned_with_dummy.prmtop", overwrite=True) structure.save("complex/aligned_with_dummy.rst7", overwrite=True) structure.save("complex/aligned_with_dummy.pdb", overwrite=True) dummy.write_dummy_frcmod(filepath="complex/dummy.frcmod") dummy.write_dummy_mol2(residue_name="DM1", filepath="complex/dm1.mol2") dummy.write_dummy_mol2(residue_name="DM2", filepath="complex/dm2.mol2") dummy.write_dummy_mol2(residue_name="DM3", filepath="complex/dm3.mol2")

Put it all together

from paprika.build.system import TLeap system = TLeap() system.output_path = "complex" system.pbc_type = None system.neutralize = False system.template_lines = [ "source leaprc.protein.ff19SB", "source leaprc.gaff", "source leaprc.water.tip3p", "loadamberparams dummy.frcmod", "loadamberparams 4CT.frcmod", "4CT = loadmol2 4CT.mol2", "DM1 = loadmol2 dm1.mol2", "DM2 = loadmol2 dm2.mol2", "DM3 = loadmol2 dm3.mol2", "model = loadpdb aligned_with_dummy.pdb", "check model", "savepdb model complex-dum.pdb", "saveamberparm model complex-dum.prmtop complex-dum.rst7" ]

import os data = "../../../paprika/data/complex"

Determine the number of windows

attach_string = "0.00 0.10 0.20 0.30 0.40 0.50 0.60 0.80 1.00 1.20 1.40 1.60 1.80 2.00 2.40 2.80 3.60 4.00 4.40 5.20 6.00 6.80 7.60 9.20 10.80 12.40 15.60 18.80 22.00 25.20 28.40 34.80 41.20 47.60 53.40 60.40 73.20 86.00 93.00 100" attach_fractions = [float(i) / 100 for i in attach_string.split()] import numpy as np initial_distance = 6.0 pull_distances = np.arange(0.0 + initial_distance, 18.0 + initial_distance, 1.0) release_fractions = []

windows = [len(attach_fractions), len(pull_distances), len(release_fractions)] print(f"There are {windows} windows in this attach-pull-release calculation.")

Add restraints using pAPRika

H1 = ":79@C" H2 = ":199@C" H3 = ":14@C" D1 = ":DM1" D2 = ":DM2" D3 = ":DM3"

Static restraints

from paprika import restraints static_restraints = [] structure = pmd.load_file("complex/complex-dum.prmtop", "complex/complex-dum.rst7", structure = True )

r = restraints.static_DAT_restraint(restraint_mask_list = [D1, H1], num_window_list = windows, ref_structure = structure, force_constant = 5.0, amber_index=True)

static_restraints.append(r) r = restraints.static_DAT_restraint(restraint_mask_list = [D2, D1, H1], num_window_list = windows, ref_structure = structure, force_constant = 100.0, amber_index=True)

static_restraints.append(r) r = restraints.static_DAT_restraint(restraint_mask_list = [D3, D2, D1, H1], num_window_list = windows, ref_structure = structure, force_constant = 100.0, amber_index=True)

static_restraints.append(r) r = restraints.static_DAT_restraint(restraint_mask_list = [D1, H1, H2], num_window_list = windows, ref_structure = structure, force_constant = 100.0, amber_index=True)

static_restraints.append(r) r = restraints.static_DAT_restraint(restraint_mask_list = [D2, D1, H1, H2], num_window_list = windows, ref_structure = structure, force_constant = 100.0, amber_index=True)

static_restraints.append(r) r = restraints.static_DAT_restraint(restraint_mask_list = [D1, H1, H2, H3], num_window_list = windows, ref_structure = structure, force_constant = 100.0, amber_index=True)

static_restraints.append(r)

guest_restraints = []

r = restraints.DAT_restraint() r.mask1 = D1 r.mask2 = G1 r.topology = structure r.auto_apr = True r.continuous_apr = True r.amber_index = True

r.attach["target"] = 6.0 # Angstroms r.attach["fraction_list"] = attach_fractions r.attach["fc_final"] = 5.0 # kcal/mol/Angstroms**2

r.pull["target_final"] = 24.0 # Angstroms r.pull["num_windows"] = windows[1]

r.initialize() guest_restraints.append(r) r = restraints.DAT_restraint() r.mask1 = D2 r.mask2 = D1 r.mask3 = G1 r.topology = structure r.auto_apr = True r.continuous_apr = True r.amber_index = True

r.attach["target"] = 180.0 # Degrees r.attach["fraction_list"] = attach_fractions r.attach["fc_final"] = 100.0 # kcal/mol/radian**2

r.pull["target_final"] = 180.0 # Degrees r.pull["num_windows"] = windows[1]

r.initialize() guest_restraints.append(r) r = restraints.DAT_restraint() r.mask1 = D1 r.mask2 = G1 r.mask3 = G2 r.topology = structure r.auto_apr = True r.continuous_apr = True r.amber_index = True

r.attach["target"] = 180.0 # Degrees r.attach["fraction_list"] = attach_fractions r.attach["fc_final"] = 100.0 # kcal/mol/radian**2

r.pull["target_final"] = 180.0 # Degrees r.pull["num_windows"] = windows[1]

r.initialize() guest_restraints.append(r)

Create the directory structure

import os from paprika.restraints import create_window_list window_list = create_window_list(guest_restraints) for window in window_list: os.makedirs(f"windows/{window}")

write the restraint information in each window

from paprika.restraints import amber_restraint_line host_guest_restraints = static_restraints + guest_restraints

for window in window_list: with open(f"windows/{window}/disang.rest", "w") as file: for restraint in host_guest_restraints: string = amber_restraint_line(restraint, window) if string is not None: file.write(string)

Save restraints to a JSON file

from paprika.io import save_restraints save_restraints(host_guest_restraints, filepath="windows/restraints.json")

Create the coordinates for each window

import shutil for window in window_list: if window[0] == "a": shutil.copy("complex/complex-dum.prmtop", f"windows/{window}/complex-dum.prmtop") shutil.copy("complex/complex-dum.rst7", f"windows/{window}/complex-dum.rst7") elif window[0] == "p": structure = pmd.load_file("complex/complex-dum.prmtop", "complex/complex-dum.rst7", structure = True) target_difference = guest_restraints[0].phase['pull']['targets'][int(window[1:])] - guest_restraints[0].pull['target_initial'] print(f"In window {window} we will translate the guest {target_difference:0.1f} Angstroms.") for atom in structure.atoms: if atom.residue.name == "4CT": atom.xz += target_difference structure.save(f"windows/{window}/complex-dum.prmtop") structure.save(f"windows/{window}/complex-dum.rst7")

Solvate each window

from paprika.build.system import TLeap for window in window_list: print(f"Solvating system in window {window}.") structure = pmd.load_file(f"windows/{window}/complex-dum.prmtop", f"windows/{window}/complex-dum.rst7")

if not os.path.exists(f"windows/{window}/complex-dum.pdb"):
    structure.save(f"windows/{window}/complex-dum.pdb")

system = TLeap()
system.output_path = os.path.join("windows", window)
system.output_prefix = "complex-dum-sol"

system.target_waters = 3000
system.neutralize = True
system.add_ions = ["Na+", 7]
system.template_lines = [
    "source leaprc.protein.ff19SB",
    "source leaprc.gaff",
    "source leaprc.water.tip3p",
    "loadamberparams ../../complex/dummy.frcmod",
    f"loadamberparams {data}/4CT.frcmod",
    f"4CT = loadmol2 {data}/4CT.mol2",
    "DM1 = loadmol2 ../../complex/dm1.mol2",
    "DM2 = loadmol2 ../../complex/dm2.mol2",
    "DM3 = loadmol2 ../../complex/dm3.mol2",
    "model = loadpdb complex-dum.pdb",
]
system.build()

Simulate

from paprika.simulate import AMBER

import logging from importlib import reload reload(logging)

logger = logging.getLogger() logging.basicConfig( format='%(asctime)s %(message)s', datefmt='%Y-%m-%d %I:%M:%S %p', level=logging.INFO ) for window in window_list: simulation = AMBER() simulation.executable = "/home/cmadrid/amber20/bin/pmemd.cuda" simulation.path = f"windows/{window}/" simulation.prefix = "minimize"

simulation.topology = "complex-dum-sol.prmtop"
simulation.coordinates = "complex-dum-sol.rst7"
simulation.ref = "complex-dum-sol.rst7"
simulation.restraint_file = "disang.rest"

simulation.config_gb_min()
simulation.cntrl["ntr"] = 1
simulation.cntrl["restraint_wt"] = 50.0
simulation.cntrl["restraintmask"] = "'@DUM'"

logging.info(f"Running minimization in window {window}...")
simulation.run()

for window in window_list: simulation = AMBER() simulation.executable = "/home/cmadrid/amber20/bin/pmemd.cuda" simulation.gpu_devices = 0

simulation.path = f"windows/{window}/"
simulation.prefix = "production"

simulation.coordinates = "minimize.rst7"
simulation.ref = "complex-dum-sol.rst7"
simulation.topology = "complex-dum-sol.prmtop"
simulation.restraint_file = "disang.rest"

simulation.config_pbc_md()
simulation.cntrl["ntr"] = 1
simulation.cntrl["restraint_wt"] = 50.0
simulation.cntrl["restraintmask"] = "'@DUM'"
simulation.cntrl["nstlim"] = 50000

logging.info(f"Running production in window {window}...")
simulation.run(overwrite=True)

Analysis

from paprika.io import load_restraints guest_restraints = load_restraints("windows/restraints.json")

from paprika import analysis free_energy = analysis.fe_calc() free_energy.prmtop = "complex-dum-sol.prmtop" free_energy.trajectory = 'production*.nc' free_energy.path = "windows" free_energy.restraint_list = guest_restraints free_energy.collect_data() free_energy.methods = ['ti-block'] free_energy.ti_matrix = "full" free_energy.bootcycles = 1000 free_energy.compute_free_energy()

free_energy.compute_ref_state_work([ guest_restraints[0], guest_restraints[1], None, None, guest_restraints[2], None ])

binding_affinity = -1 * ( free_energy.results["attach"]["ti-block"]["fe"] + \ free_energy.results["pull"]["ti-block"]["fe"] + \ free_energy.results["ref_state_work"]

)

sem = np.sqrt( free_energy.results["attach"]["ti-block"]["sem"]2 + \ free_energy.results["pull"]["ti-block"]["sem"]2 )

print(f"The binding affinity = {binding_affinity:0.2f} +/- {sem:0.2f} kcal/mol")

jeff231li commented 3 years ago

If I ran the tleap individually with in paprika environment its works fine, but in paprika.py script it cause problem. In implicit solvent I am able to run pakrika but in it I ran the tleap seprate to generate models. In explicit solvent i have to solvate the windows and Now I failing.

I'm not sure what you mean with running "tleap individually with in paprika environment". Do you mean running your script in a jupyter notebook? What is in the script paprika.py?

Note: If you want to print the log file for TLeap set this variable to False system.build(clean_files=False). There should be a file called leap.log, this should tell you the reason for the error.

rehan1988 commented 3 years ago

Hi

The error is related solvation step.

leap log files says this

/home/cmadrid/amber20/bin/teLeap: Fatal Error! solvateBox: Bad box clearance. usage: solvateBox [iso] [closeness]

Exiting LEaP: Errors = 1; Warnings = 0; Notes = 0.

tleap.in source leaprc.protein.ff19SB source leaprc.gaff source leaprc.water.tip3p source leaprc.water.tip3p loadamberparams frcmod.tip3p loadamberparams ../../complex/dummy.frcmod loadamberparams /home/cmadrid/umbrella/EC/complex/4CT.frcmod 4CT = loadmol2 /home/cmadrid/umbrella/EC/complex/4CT.mol2 DM1 = loadmol2 ../../complex/dm1.mol2 DM2 = loadmol2 ../../complex/dm2.mol2 DM3 = loadmol2 ../../complex/dm3.mol2 model = loadpdb complex-dum.pdb

solvatebox model TIP3PBOX -8.0 iso addionsrand model Na+ 0 addionsrand model Cl- 0 addionsrand model Na+ 6 addionsrand model Cl- 6 desc model quit

script

Solvate each window

from paprika.build.system import TLeap for window in window_list: print(f"Solvating system in window {window}.") structure = pmd.load_file(f"windows/{window}/complex-dum.prmtop", f"windows/{window}/complex-dum.rst7")

if not os.path.exists(f"windows/{window}/complex-dum.pdb"):
    structure.save(f"windows/{window}/complex-dum.pdb")

system = TLeap()
system.output_path = os.path.join("windows", window)
system.output_prefix = "complex-dum-sol"

system.target_waters = 2000
system.neutralize = True
system.add_ions = ["Na+", 6, "Cl-", 6]
system.template_lines = [
    "source leaprc.protein.ff19SB",
    "source leaprc.gaff",
    "source leaprc.water.tip3p",
    "loadamberparams ../../complex/dummy.frcmod",
    f"loadamberparams {data}/4CT.frcmod",
    f"4CT = loadmol2 {data}/4CT.mol2",
    "DM1 = loadmol2 ../../complex/dm1.mol2",
    "DM2 = loadmol2 ../../complex/dm2.mol2",
    "DM3 = loadmol2 ../../complex/dm3.mol2",
    "model = loadpdb complex-dum.pdb",
]
system.build()

Kindly guide. Thanks

jeff231li commented 3 years ago

I think the problem here is that 2000 water molecules is not enough to solvate your protein-ligand system. This is a good amount of host-guest systems but not for protein-ligand systems. Increase the number of water molecules to an appropriate amount and this should solve the issue.

rehan1988 commented 3 years ago

I tried 3000 but same error, do you have any idea how many water molecules we can add for receptor ligand structure. Or what people use for this kind of systems, I mean is there any standard for it.

rehan1988 commented 3 years ago

its works for 6000.

jeff231li commented 3 years ago

That would depend on how big the system is and I think 6000 is still too small. You can solve the system with buffer_value instead of target_waters and see how many water molecules there are in the system. The code below solvates the system with a 12.0 Angstrom padding in all directions. Once you know the number of water molecules (you can round it to the nearest 100) you solvate each APR window with the same amount of water molecules with target_waters.

system = TLeap()
system.output_path = os.path.join("windows", window)
system.output_prefix = "complex-dum-sol"

system.buffer_value = 12.0
system.neutralize = True
system.add_ions = ["Na+", 6, "Cl-", 6]
system.template_lines = [
    "source leaprc.protein.ff19SB",
    "source leaprc.gaff",
    "source leaprc.water.tip3p",
    "loadamberparams ../../complex/dummy.frcmod",
    f"loadamberparams {data}/4CT.frcmod",
    f"4CT = loadmol2 {data}/4CT.mol2",
    "DM1 = loadmol2 ../../complex/dm1.mol2",
    "DM2 = loadmol2 ../../complex/dm2.mol2",
    "DM3 = loadmol2 ../../complex/dm3.mol2",
    "model = loadpdb complex-dum.pdb",
]
system.build()python
rehan1988 commented 3 years ago

Thank you it works but now fail with this error. I am also checking the previous minimization files all out files completed normally. Is there something I am missing.

Traceback (most recent call last): File "/home/cmadrid/umbrella/EC/pAPRika2.py", line 322, in simulation.run(overwrite=True) File "/home/cmadrid/anaconda3/envs/paprika/lib/python3.9/site-packages/paprika/simulate/amber.py", line 671, in run raise Exception( Exception: Exiting due to failed simulation! Check logging info.

Please also guide me can I increase the number of gpus in the code I am using only one gpu. simulation.gpu_devices = 0

jeff231li commented 3 years ago

Traceback (most recent call last): File "/home/cmadrid/umbrella/EC/pAPRika2.py", line 322, in simulation.run(overwrite=True) File "/home/cmadrid/anaconda3/envs/paprika/lib/python3.9/site-packages/paprika/simulate/amber.py", line 671, in run raise Exception( Exception: Exiting due to failed simulation! Check logging info.

Can you check what the error is in the output file from the simulation?

Please also guide me can I increase the number of gpus in the code I am using only one gpu. simulation.gpu_devices = 0

To add more than 1 GPU you need to specify the GPU IDs as a string like "0,1" or "0,1,2,3". NOTE: I doubt you will get a good speedup using multiple GPUs in Amber (and OpenMM) they're optimized for single GPUs. Multiple GPUs are utilized in MD programs like Gromacs and NAMD and you can get good speedup with these.

rehan1988 commented 3 years ago

minimization files ends fine but production.out file says ERROR: Box parameters not found in inpcrd file!

Where we need to define the box parameter again my production.in file is this PBC MD Simulation &cntrl imin = 0, ntx = 1, irest = 0, maxcyc = 0, ncyc = 0, dt = 0.002, nstlim = 50000, ntpr = 500, ntwe = 500, ntwr = 5000, ntwx = 500, ntxo = 1, ioutfm = 1, ntf = 2, ntc = 2, cut = 9.0, igb = 0, tempi = 298.15, temp0 = 298.15, pres0 = 1.01325, ntt = 3, gamma_ln = 1.0, ig = -1, ntp = 1, barostat = 2, ntr = 1, restraint_wt = 50.0, restraintmask = '@DUM', nmropt = 1, pencut = -1, iwrap = 1, ntb = 2, / &wt type = 'END', / DISANG = disang.rest LISTOUT = POUT

Regards

jeff231li commented 3 years ago

ERROR: Box parameters not found in inpcrd file!

When you ask TLeap to solvate it should write the box parameters automatically at the end of the *.rst7 file. Did you add enough water molecules to the system?

rehan1988 commented 3 years ago

I tried

system.buffer_value = 12.0 system.neutralize = True

and this

system.target_waters = 2000 system.neutralize = True system.add_ions = ["Na+", 6, "Cl-", 6]

the minimization work fine but production stop.

Do i need to add more water atoms rather then 6000. Thanks for you time. system.add_ions = ["Na+", 6, "Cl-", 6]

and this

rehan1988 commented 3 years ago

I ran it for 6000 not for 2000 system.target_waters = 6000

slochower commented 3 years ago

Hi @rehan1988, we can help with the code and also advise you on some guidelines for the calculation, but only you will be able to look at your protein-ligand complex and decide if 6000 waters is sufficient. You will want to have enough water surrounding your protein that when you pull the ligand away from the binding site, it will be a bulk-like environment. It seems possible that with too few waters, the periodic boundary conditions are not setup correctly and this is what's happening, but I cannot tell for sure. Either the periodic boundary conditions are lost after the initial solvation step (you should be able to check this by looking at the last line of the coordinate file after solvation) or after minimization/equilibration.

rehan1988 commented 3 years ago

HI Thank you for the solution, It is working with sander but stop with pemed.cuda after 1ps. But with sander it is extremely slow could you tell me any solution so that I run this calculations with gpu.

Could you please guide me if i just want to pull the guest out from host from 1-6A away not 18A, with the fraction of .2 to .4

like p1= .2, p2= .4, p3=.6 etc (ligand moves away) how I can do this in these commands

initial_distance = 2.0 pull_distances = np.arange(0.0 + initial_distance, 18.0 + initial_distance, 1.0) release_fractions = []

windows = [len(attach_fractions), len(pull_distances), len(release_fractions)]

Best

jeff231li commented 3 years ago

You specify the pull distances using a Numpy array, the example in the README file uses the arange function but you can use other Numpy functions (like linspace). Take a look at the Numpy documentation for the arange function.

If you use the GPU version of Amber (pmemd.cuda) your simulations will crash (too large volume change) if the periodic box has not relaxed to an equilibrium state. To overcome this issue, run a short MD with sander or pmemd.MPI, then switch to pmemd.cuda.

jeff231li commented 1 year ago

Closing, I assume you have the simulations working now. Feel free to reopen if you're still running into problems.