michellab / Sire

Sire Molecular Simulations Framework
http://siremol.org
GNU General Public License v3.0
95 stars 26 forks source link

TI Gives Non-Zero Free Energy Change with Dummy Perturbation #383

Open fjclark opened 2 years ago

fjclark commented 2 years ago

Hello,

When calculating the free energy change for a perturbation involving only dummy atoms to dummy atoms, MBAR gives exactly zero kcal/ mol, as expected, whereas TI gives positive free energies which vary dramatically with delta_lambda. My version of Sire is only two commits behind devel and the files to reproduce the issue are here.

With the default delta_lambda = 0.001, and two lambda windows at 0 and 1, I get a free energy change of 0.8 kcal / mol. This drops to < 0.1 kcal / mol if I increase delta_lambda to 0.01, and increases substantially when I drop delta_lambda to 0.0005. If I add a lambda window at 0.001, the free energy increases to > 10^6 kcal / mol due to a gradient of > 10^7 kcal / mol for this window. Checking the reduced perturbed energies shows that the energies at the end points are identical, although there is a difference compared to the energies at lambda = 0.001.

By modifying this function I'm able to see that the non-zero gradients are exclusively due to changes in "lamhd". However, because all final and initial parameters are identical, it looks like the lamhd-dependent terms here should be symmetrical about lamdh = 0.5, and therefore I would expect equal and opposite gradients giving an overall free energy change of 0 when lambda windows are symmetrical about 0.5.

I've come across another situation where the gradient seems unreliable, although this could be due to my changes to sire. When turning on Boresch restraints between a protein and ligand (again using a dummy pert file) using this version of Sire and this input, I see very infrequent jumps to extremely negative gradients, while the reduced perturbed energies for neighbouring lambda values remain very similar. I only see this at lambda = 0.000, and I see it for each of five runs. The spikes are at the same values of total simulation time for each run (x-axis should show simulation time from 0 to 5 ns): image

The MBAR estimate for the free energy change is as expected.

Any help with this would be greatly appreciated.

Thanks, Finlay

jmichel80 commented 2 years ago

Hi @fjclark

fjclark commented 2 years ago

Hi @jmichel80

Num bonds 1-4 From Dummy = 73 Num bonds 1-4 From Dummy To Dummy = 0

But this doesn't seem to be the cause of the gradient issue.

jmichel80 commented 2 years ago
fjclark commented 2 years ago

Thanks.

fjclark commented 2 years ago

The issue is very likely due to custom_intra_14_clj (https://github.com/michellab/Sire/blob/2ed5eb41558403ee0fae920f778ca9a6bf823b5c/corelib/src/libs/SireMove/openmmfrenergyst.cpp#L1094) used between atoms which are hard at the start and end. This is because:

  1. It is the only force affected by lamhd, and when I modify the code so that lamhd is not modified when the gradient is calculated, the gradient is zero in all cases.

  2. The free energy change scales with the number of 1-4 interactions between hard atoms. The change is greatest in the protein-ligand system, where the entire error is likely due to interactions in the protein only (no hard atoms in ligand), is much smaller in free calculations where the only 1-4 interactions are due to the ligand (when the .pert file maps fully-interacting to fully-interacting), and is zero in free calculations when I set all ligand atoms to dummies.