choderalab / yank

An open, extensible Python framework for GPU-accelerated alchemical free energy calculations.
http://getyank.org
MIT License

Cannot set up system #758

Open tbrailov opened 7 years ago

tbrailov commented 7 years ago

I am trying to calculate binding free energies with YANK between a protease and a short peptide sequence. I initially tried setting up the system with the ligand specified via a PDB file. I realize now that this causes issues (see issue #638). However, I never reached the error that the person who started #638 had, because tleap never finished setting up the system. I ran the simulation for 10 hours with 8 processes on Comet, which should have been plenty of time to set up the system.

Now I am trying to set up my system with OpenMM and provide that to YANK directly. I am following the directions here: http://getyank.org/0.15.0/yamlpages/systems.html#yaml-systems-user-defined. However, I am a bit confused as to what phase_1 and phase_2 are. If I am understanding the instructions correctly, phase_1 is the solvated complex of the ligand and the receptor. But what is phase_2? Is it the dissociated complex in solution (i.e. receptor and ligand are no longer bound)? Or is it just the solvent?

Thanks, Tatiana

jchodera commented 7 years ago

> because tleap never finished setting up the system. I ran the simulation for 10 hours with 8 processes on Comet, which should have been plenty of time to set up the system.

Can you post a tarball of your input files so we can try to locally replicate your issue?

> Now I am trying to set up my system with OpenMM and provide that to YANK directly. I am following the directions here: http://getyank.org/0.15.0/yamlpages/systems.html#yaml-systems-user-defined. However, I am a bit confused as to what phase_1 and phase_2 are. If I am understanding the instructions correctly, phase_1 is the solvated complex of the ligand and the receptor. But what is phase_2? Is it the dissociated complex in solution (i.e. receptor and ligand are no longer bound)? Or is it just the solvent?

If you wish to compute the absolute binding free energy of a ligand by alchemically eliminating the ligand, phase_1 is the complex while phase_2 is the ligand in solution, and the same ligand must be selected as the alchemically-modified region in both phases.
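A minimal sketch of how the two phases combine into a binding free energy, assuming each phase reports the free energy of alchemically eliminating the ligand. The function name and the numbers are hypothetical placeholders, and the sketch leaves out the restraint and standard-state corrections that a real analysis would include.

```python
import math


def binding_free_energy(dg_complex, dg_solvent, ddg_complex=0.0, ddg_solvent=0.0):
    """Combine two alchemical phases into a binding free energy estimate.

    dg_complex: free energy of eliminating the ligand in the complex (phase_1)
    dg_solvent: free energy of eliminating the ligand in solution (phase_2)
    ddg_*:      statistical uncertainties of the two phases

    All values must share the same units. Restraint and standard-state
    corrections are NOT included in this sketch.
    """
    # Closing the thermodynamic cycle: binding = solvent leg - complex leg.
    dg_bind = dg_solvent - dg_complex
    # Independent uncertainties add in quadrature.
    ddg_bind = math.sqrt(ddg_complex ** 2 + ddg_solvent ** 2)
    return dg_bind, ddg_bind


# Placeholder numbers for illustration only, not results from any simulation.
dg, err = binding_free_energy(dg_complex=30.0, dg_solvent=20.0,
                              ddg_complex=0.3, ddg_solvent=0.4)
print(f"{dg:.1f} +/- {err:.1f}")
```

Because the same ligand is eliminated in both phases, the decoupled end states match and the difference between the two legs is the quantity of interest.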

jchodera commented 7 years ago

We should warn you that eliminating peptides should be considered highly experimental at this point, and will likely require a very large number of alchemical states and a huge amount (possibly 1 us/replica or more) of sampling.

Also, be sure you're using the GPU resources of Comet, or else this will take forever. We'd be curious to hear about your experiences using YANK on XSEDE resources; we haven't tried this since XSEDE systematically disinvested in GPU computing with the decommissioning of Forge in 2012. It's good to see they've realized that GPUs are actually useful for scientific computing.

tbrailov commented 7 years ago

Here is the tar.gz file with the .yaml file and the two pdb files. tev_sys.tar.gz

Lnaden commented 7 years ago

@tbrailov Thanks for the files. A few things:

What version of YANK are you using? And is it possible to tar up your pre-made systems as well? I'm curious to see if we can rebuild it here as well. I'm having some local issues with the ambertools suite throwing an error, but that might be unrelated to the problem you are seeing.

tbrailov commented 7 years ago

I am using YANK 0.16.2 with Python 3.6 (on Comet).

I was able to successfully set up phase_1 (complex.pdb in complex_pdb.tar.gz and complex.xml in complex_xml.tar.gz); however, I am having issues with setting up phase_2. The system that I set up with the solvated ligand does not have any waters around the ligand. There is just a box of water, and then some distance away there is the ligand itself. See files solvated_ligand.pdb and solvated_ligand.xml. I also included the Python files I used to set up the system in OpenMM. For phase_1 the file is called setup_nia_substrate_sys.py and for phase_2 the file is called setup_ligand.py. The files are nearly identical, just with different .pdb files being loaded at the beginning. For these systems I am using OpenMM 7.1.1 with Python 3.5. (Sorry I have multiple files; GitHub wouldn't let me upload a single archive because it was too big.) complex_pdb.tar.gz complex_xml.tar.gz setup_sys.tar.gz
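One way to check a symptom like this without a viewer is to test whether the solute's centroid falls inside the bounding box of the water atoms. The sketch below uses only the standard library and the fixed-column PDB layout. The function names, the water residue names, and the ion names to ignore are assumptions for illustration, not anything taken from the attached scripts.

```python
WATER_NAMES = {"HOH", "WAT"}   # assumed water residue names
IGNORED_NAMES = {"NA", "CL"}   # assumed ion residue names, skipped entirely


def parse_atoms(pdb_text):
    """Yield (resname, x, y, z) from ATOM/HETATM records (fixed PDB columns)."""
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            resname = line[17:20].strip()
            x = float(line[30:38])
            y = float(line[38:46])
            z = float(line[46:54])
            yield resname, x, y, z


def solute_inside_water_box(pdb_text):
    """Return True if the solute centroid lies within the water bounding box."""
    waters, solute = [], []
    for resname, x, y, z in parse_atoms(pdb_text):
        if resname in IGNORED_NAMES:
            continue
        (waters if resname in WATER_NAMES else solute).append((x, y, z))
    if not waters or not solute:
        raise ValueError("need both water and solute atoms")
    lo = [min(c[i] for c in waters) for i in range(3)]
    hi = [max(c[i] for c in waters) for i in range(3)]
    centroid = [sum(c[i] for c in solute) / len(solute) for i in range(3)]
    return all(lo[i] <= centroid[i] <= hi[i] for i in range(3))


# Usage (file name taken from the comment above):
# with open("solvated_ligand.pdb") as fh:
#     print(solute_inside_water_box(fh.read()))
```

A result of False would be consistent with the ligand sitting outside the water box, though it cannot say whether that comes from the setup itself or from how periodic images were written to the file.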

tbrailov commented 7 years ago

By the way - I would also like to add that I had another issue with setting up the systems. When I ran the setup_nia_substrate_sys.py script on Comet, I got the following error:

Traceback (most recent call last):
  File "setup_nia_substrate_sys.py", line 18, in <module>
    mod.addHydrogens(forcefield)
  File "/home/tbrailov/anaconda3/lib/python3.6/site-packages/simtk/openmm/app/modeller.py", line 867, in addHydrogens
    system = forcefield.createSystem(newTopology, rigidWater=False, nonbondedMethod=CutoffNonPeriodic)
  File "/home/tbrailov/anaconda3/lib/python3.6/site-packages/simtk/openmm/app/forcefield.py", line 1077, in createSystem
    raise ValueError('No template found for residue %d (%s).  %s' % (res.index+1, res.name, _findMatchErrors(self, res)))
ValueError: No template found for residue 587 (GLU). The set of atoms matches GLU, but the bonds are different.

However, there is no problem when I run it on my computer. I suspect it may be the difference in Python versions. I have Python 3.5 on my computer, while I have Python 3.6 on Comet.

Thanks, Tatiana
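When a script behaves differently on two machines, comparing package versions alongside the interpreter version can help narrow down the cause. The sketch below is a standard-library-only report. The distribution names in `packages` are assumptions and may not match how a given environment registers them, and `importlib.metadata` requires Python 3.8 or newer, so it would not run on the 3.5 and 3.6 interpreters mentioned above.

```python
import sys
from importlib import metadata  # available in Python 3.8+


def environment_report(packages=("openmm", "yank", "numpy")):
    """Return a dict mapping 'python' and each package name to a version string."""
    report = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Either not installed or registered under a different name.
            report[name] = "not found"
    return report


for name, version in environment_report().items():
    print(f"{name}: {version}")
```

Running this on both machines and comparing the output line by line shows whether anything other than the Python version differs.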

jchodera commented 7 years ago

@Lnaden : Would be good to look into these issues.

tbrailov commented 7 years ago

I was actually able to resolve the ligand solvation issue in a peculiar way. The crystal structure of the protein:ligand complex had some waters around the active site, which I originally did not include in the substrate.pdb file. I tried including those waters with the ligand and running setup_ligand.py with the new .pdb file, and I was able to put a solvation box around the ligand. This does not seem like a very intuitive fix, and I am not sure why it worked. Here is the new substrate.pdb file and the solvated ligand. working_solvation.tar.gz