MobleyLab / benchmarksets

Benchmark sets for binding free energy calculations: Perpetual review paper, discussion, datasets, and standards
BSD 3-Clause "New" or "Revised" License

CD mol2 files coordinates starting in the binding pocket #41

Open andrrizzi opened 7 years ago

andrrizzi commented 7 years ago

Hi all, we'd like to use the CD input files to run YANK calculations. In particular, we'd like to start from the .mol2 files currently in the nieldev branch to prepare our solvation boxes in TIP4P-EW waters. A couple of questions (tagging @nhenriksen, who is working on the branch):

  1. Could you confirm that the mol2 files already have the same protonation state/charges that were used in the reference calculation?
  2. Would it be possible for the coordinates in the mol2 files to match those in the final rst7 files, so that the guest starts in the binding pocket? I can work on this myself if you don't have time but are still interested. I'll have to do it anyway in the next couple of days to set up my simulations.

nhenriksen commented 7 years ago

Hi Andrea. Yes, the mol2 files have the expected protonation based on pH 6.9, which was used experimentally. I agree that it would be better to have the mol2 coordinates match the rst7 coordinates. I need to focus on finishing up edits I'm making to the benchmarksets paper first, but I could work on this later this week. If you get to it sooner than that, we can use your files.

andrrizzi commented 7 years ago

Ok, thanks! I'll let you know if/when I'll get to this so that we won't both work on the same thing.

andrrizzi commented 7 years ago

@nhenriksen just a heads up that I'll work on the mol2 coordinates this afternoon. I'll fork the repo and open a PR to the nieldev branch when I'm done.

andrrizzi commented 7 years ago

Question: what is the difference between the -p and -s versions of the prmtop and rst7 files (apologies if it's written somewhere and I've missed it)? And is there one I should prefer for the single-molecule file coordinates? I've noticed that the following files in cd-set1/ are missing an rst7 companion: acd-s15-s.prmtop, acd-s11-p.prmtop, acd-s19-p.prmtop, acd-s21-p.prmtop.
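For anyone who wants to repeat this check, here is a minimal sketch. It assumes that a prmtop and its rst7 companion share the same file stem, as the file names above suggest; the function name `prmtops_missing_rst7` is made up for illustration.

```python
from pathlib import Path


def prmtops_missing_rst7(directory):
    """Return the sorted names of .prmtop files that have no same-stem .rst7 file.

    Assumption: companions share a stem, e.g. acd-s15-s.prmtop / acd-s15-s.rst7.
    """
    directory = Path(directory)
    # Collect the stems of every rst7 file once, then compare each prmtop against them.
    rst7_stems = {path.stem for path in directory.glob("*.rst7")}
    return sorted(
        path.name
        for path in directory.glob("*.prmtop")
        if path.stem not in rst7_stems
    )


# Usage (path is relative to a local checkout of the branch):
#     for name in prmtops_missing_rst7("cd-set1"):
#         print(name)
```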

I've also noticed that the pdb and sd files have the same coordinates as the mol2 files. If you agree, @nhenriksen, I'd change the coordinates of those files too to make them consistent.

nhenriksen commented 7 years ago

I have a brief comment in the README, although maybe it needs more explanation: " To account for the two possible orientations of the guest within the CD cavity, simulation files with the '-p' suffix indicate that the guest is bound with the polar functional group oriented out of the primary (narrow) face of the CD, whereas the '-s' suffix indicates the guest polar functional group is oriented out of the secondary (wider) face of the CD."

Basically, the host is asymmetric, so there are two obvious orientations for a polar guest to bind.
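A small sketch of how the suffix convention could be read from a file name in a script. It assumes names of the form `<host>-<guest>-<p|s>.<ext>`, as in `acd-s15-s.prmtop`; the helper `parse_complex_name` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Suffix meaning as described in the README: the CD face that the guest's
# polar functional group is oriented out of.
ORIENTATIONS = {"p": "primary", "s": "secondary"}


def parse_complex_name(filename):
    """Split a name like 'acd-s15-s.prmtop' into (host, guest, orientation).

    Assumption: the stem has exactly three dash-separated fields.
    """
    parts = Path(filename).stem.split("-")
    if len(parts) != 3 or parts[2] not in ORIENTATIONS:
        raise ValueError(f"unexpected file name: {filename!r}")
    host, guest, suffix = parts
    return host, guest, ORIENTATIONS[suffix]


# Usage:
#     parse_complex_name("acd-s11-p.prmtop")
```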

I'll look into the issue with the missing rst7s ....

I agree that we should update all the coordinates.

andrrizzi commented 7 years ago

> I have a brief comment in the README

Ah! Sorry, I completely missed it, thanks! I'll arbitrarily pick all the s as a reference then unless you have a reason to prefer the other (or to pick them 50-50).

nhenriksen commented 7 years ago

From the experimental literature, the default expectation is that most guests will prefer the "-s" orientation, although there is not a whole lot of data to support that. We've been doing some NMR in our lab that paints a complicated picture of guest orientation, and in simulations, the ammoniums definitely prefer the "-p" orientation for some force fields. So both orientations will ultimately need to be considered. But for the purposes of parameters, etc., the "-s" orientation is fine.

davidlmobley commented 7 years ago

I just want to thank you guys for the great work and dialogue here. This is exactly the sort of thing I hoped getting more of these types of systems up on GitHub would nucleate -- the ability to reuse them, with knowledge about what should happen/what's important, without having to read 20 papers on the topic. :)

nhenriksen commented 7 years ago

The setup simulations for those missing files had an error that evaded my detectors. I've added them now; see commit daa8b2e74f7bdada764fe97cd64e06756d19a378.

andrrizzi commented 7 years ago

> I've added them now

Thanks!

Since the host conformations in the rst7 files after equilibration are all slightly different, I was thinking about using structural alignment to get the new mol2 positions. Would this be ok? I'm not sure I can do much better without the docked poses.
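To make the idea concrete, here is a minimal NumPy sketch of one way to do the alignment: a Kabsch least-squares superposition. It is an illustration under stated assumptions, not necessarily what ends up in the PR. The host atoms in the rst7 and mol2 coordinates are assumed to be in the same order, all arrays are N x 3, and the names `kabsch` and `place_guest` are made up here.

```python
import numpy as np


def kabsch(mobile, reference):
    """Return (R, t) such that mobile @ R.T + t best fits reference (least-squares RMSD)."""
    mobile = np.asarray(mobile, dtype=float)
    reference = np.asarray(reference, dtype=float)
    mobile_center = mobile.mean(axis=0)
    reference_center = reference.mean(axis=0)
    # Covariance matrix of the centered coordinate sets.
    H = (mobile - mobile_center).T @ (reference - reference_center)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = reference_center - mobile_center @ R.T
    return R, t


def place_guest(host_rst7, guest_rst7, host_mol2):
    """Move the guest from the rst7 frame into the frame of the mol2 host.

    The transform is fitted on the host atoms only, then applied to the guest.
    """
    R, t = kabsch(host_rst7, host_mol2)
    return np.asarray(guest_rst7, dtype=float) @ R.T + t
```

Because the equilibrated host is slightly deformed relative to the mol2 host, the fit is only approximate, so the resulting guest position is a reasonable starting pose rather than an exact copy of the bound geometry.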

nhenriksen commented 7 years ago

Sounds good!

davidlmobley commented 7 years ago

@andrrizzi - have you gotten around to generating "ligand in solution" files for these yet as well? One of my rotation students needs these and I'm trying to sort out whether she should do that (and probably deposit here) or whether that would duplicate work you are already doing for Yank.

andrrizzi commented 7 years ago

> have you gotten around to generating "ligand in solution" files for these yet as well?

That's my next step for the Yank calculations I need to run. I plan to set them up during the weekend.

davidlmobley commented 7 years ago

OK, thanks. Then I won't have my rotation student do it. But can you share code?

andrrizzi commented 7 years ago

I was actually planning to use the Yank pipeline to set them up. With these mol2 files, it can handle the whole preparation, so I don't think there will be code to share.

davidlmobley commented 7 years ago

Ah, OK, thanks. So it's a "provide a mol2 file of the ligand and a PDB file of the complex" kind of thing, then? Sorry, I forget there are so many modes for using Yank.

andrrizzi commented 7 years ago

> provide a mol2 file of the ligand and a PDB file of the complex

Almost. The only input files of the pipeline are the mol2 files of the single molecules that I've modified here (both host and guests). The pipeline then produces (unminimized/unequilibrated) amber files and pdb files of both the complex and solvent phases, which YANK uses to run the calculations. But if your student plans to use YANK, the YAML script I'll share plus the mol2 files should suffice (of course, as you may remember, they can also use prepared system files if they prefer).

davidlmobley commented 7 years ago

Yes, please share yaml!!

davidlmobley commented 7 years ago

@andrrizzi - any updates on the setup of your calculations? I will need to get my rotation student going soon, so I should probably start her working on setting some up for her purposes (cross comparing different forms of restraints) if yours aren't coming along.

andrrizzi commented 7 years ago

@davidlmobley I didn't have time to finish this weekend, sorry, but the script should be ready (and uploaded) by the end of the day.