GilsonLabUCSD / pAPRika

Advanced toolkit for binding free energy calculations
BSD 3-Clause "New" or "Revised" License
31 stars 14 forks source link

Binding affinity in cb6-but-pbc very different #199

Closed martinj80 closed 2 months ago

martinj80 commented 3 months ago

Hi, upon running the NB with Amber22, I am getting very differrent results from those precalculated in the NB file, also the binding_affinity variable for some reason also contains the unit. Is this OK?

My result: The binding affinity for butane and cucurbit[6]uril = -157.63 kilocalorie / mole +/- 11.34 kilocalorie / mole kcal/mol

jeff231li commented 3 months ago

@martinj80 The binding affinity variable should contain the unit and this is the intended behavior from v1.2.0. I suspect you are getting a different value because the simulations were very short in each window. This is indicated by the magnitude of the error = 11.34 kcal/mol. If you run it longer, perhaps 5ns per window, you would get a closer value to the pre-calculated value in the notebook.

martinj80 commented 3 months ago

That would be first thing to try, even though the difference is really huge (9 vs. 150). I thought the precalculated value was calculated by the same code as supplied, that is why I wrote here. If not, and I suspected so, all is fine.

jeff231li commented 3 months ago

I don't remember how long I ran the simulations on the NB, but I suspect I ran them much longer. One way to debug in a production run is to look at the free energy for the different phases (attach, pull, release). If the free energy for a phase looks incredibly wrong, the next step would be to view the trajectory for the windows in a GUI like VMD or PyMol. Most of the time, you can see what's wrong with the simulation (wrong choice of restraints, etc)

print(free_energy.results["attach"]["ti-block"]["fe"])
print(free_energy.results["pull"]["ti-block"]["fe"])
martinj80 commented 3 months ago

I checked that before, actually. The pull phase is 160kcal, the rest is below 10 so I guess the culprit is the pull phase here...

martinj80 commented 3 months ago

I increased the nstlim to 1ns (500,000steps) but the results are still horrible not much changed. The trajectory seems fine, no obvious problems. I am now running 10 ns production step...

slochower commented 3 months ago

I thought the values in the README are what you'd get if you evaluated the code as written -- and of course we expect fluctuation in the mean ΔG given the high uncertainty, but seeing -160 kcal/mol is definitely outside of what I'd expect. I'm curious what a 10 ns production phase gives you. Is the restraint energy really really high?

jeff231li commented 3 months ago

Also, you could check the free profile across the pull phase. From memory, the free energy for each window is stored in fe_matrix, but I can't remember if you need to remove the units first from the array for matplotlib.

plt.plot(free_energy.results["pull"]["ti-block"]["fe_matrix"][0,:])

You can also plot the histogram overlap between neighboring windows in the pull phase. Normally, there is a connection between skewed free energy values and the (lack of) overlap between histograms. I will need to go through my folders to find the notebook, I'll get back to you on this one.

Cucurbiturils are known to be very rigid and highly polar at the entrance, which makes it hard for the guest molecule to leave the pocket (hence, the unusually large binding affinity for like CB7-8). In the original APR paper (SI of https://pubs.acs.org/doi/10.1021/acs.jctc.5b00405), they had to apply jack restraints on the CB7 to make the portal wider, which improved the overlap and/or convergence in the pull phase.

martinj80 commented 3 months ago

even with 10ns, the value is still similar. I checked the fe_matrix in the pull phase: [-0.0, 1.11, 4.31, 9.49, 12.82, 13.61, 13.82, 12.59, 11.50, 10.62, 10.07, 10.17, 10.14, 9.920, 17.66, 46.76, 97.98, 171.3] problem occurs in the last few windows or in the first windows (is the order ascending or descending)? I see no extreme restrain energies (all approx. 2-13) in the md outputs, although the histograms (almost) do not overlap between 4. and 5. pull window. Otherwise, all looks pretty much normal from what I can see.

jeff231li commented 3 months ago

@martinj80 I think I know what the problem is. If you look at the free energy values for the pull phase, the free energy converges around the 10 kcal/mol value. However, it then jumps to 17.66 and then all the way up to 171.3 kcal/mol. I'm pretty sure this is because the water box is too small and the butane molecule is probably reaching the upper boundary of the box. There are two ways to recover the free energy:

Let me know if this helps fix the issue.

martinj80 commented 3 months ago

Thank you for the suggestions. Increasing the number of water by 1000 improved it, with 50000 steps I got to 13.99 +/- 1.54 kcal/mol.

jeff231li commented 2 months ago

Closing since the question has been answered.