Simplifying standard state correction application

jchodera commented 9 years ago

I want to try to simplify the way standard state corrections are incorporated in Yank, and was hoping to get suggestions from @luirink and @davidlmobley.

For generality, Yank uses the concept of "phases", where each phase corresponds to one leg of a thermodynamic cycle. This allows us to express calculations like absolute binding free energies, hydration free energies, and relative free energies in the same framework.

Currently, some collections of phases are recognized as needing standard state corrections, but the machinery for this is geared toward recognizing special cases. Instead, I would like to generalize this so that every phase carries with it a appropriate standard state correction (which may be zero), which are summed in the appropriate order to recover the total free energy change along the legs of the thermodynamic cycle.

For the simplest case of protein-ligand binding, this would mean that a ligand:protein complex where the ligand is restrained throughout would include the free energy of transferring the noninteracting ligand from the restraint to a box of standard volume. For an unrestrained ligand:protein simulation, this would be the free energy of transferring the noninteracting ligand from an empty box of the simulation volume to one of standard volume.

What about ligand in solvent simulations, both for binding and transfer free energy calculations? Should this also include the free energy of changing the box size of it is not of standard volume?

And in the case of hydration free energies, we have A vacuum phase that is not conducted in a periodic box. This should presumably have a standard state correction of zero.

davidlmobley commented 9 years ago

It would certainly be nice to have every phase carry an appropriate correction.

What about ligand in solvent simulations, both for binding and transfer free energy calculations? Should this also include the free energy of changing the box size of it is not of standard volume?

Possibly that would be one way to handle things, though it actually might make it more complicated.

For background, recall that typically for something like a hydration or solvation free energy calculation, the standard state ends up being irrelevant because one wants to compute the solvation free energy at a fixed concentration (a "1M to 1M" standard state, basically) and we do the calculation without changing the box size - i.e. in a hydration free energy calculation I compute the free energy of removing a solvent molecule from water but keeping it in the same box size (I'm thinking here of decoupling, but see discussion below), so the concentration does not change.

So, what about standard state corrections? Well, then, they should be zero as long as you're not changing the concentration of solute in your calculations. The way we NORMALLY handle this is just to ignore them since we know they are zero. But in your framework, it seems like it would perhaps be possible to compute the free energy of taking the solvated solute to 1M, and then subtract the free energy of taking the unsolvated solute to 1M. These would cancel so you would end up with no standard state correction. But this also should generalize to a case where someone managed to cook up a protocol which DID change the concentration of solute while changing phases (though I can't at the moment think of why it would make sense to do so in solvation calculations).

And in the case of hydration free energies, we have A vacuum phase that is not conducted in a periodic box. This should presumably have a standard state correction of zero.

OK, this is a little tricky. The discussion above was oriented towards a case where I imagine doing what GROMACS does now and computing the free energy of decoupling the solute in solvent, so I don't have a separate vacuum calculation. In the decoupling case I basically have the original box size for the solute in solution, and an implied separate box for the solute by itself (though I never actually simulate this system by itself - I'm instead simulating the noninteracting solute in solvent). In the "annihilation" case where I turn off the solute in solvent and turn it back on in the gas phase, things are SLIGHTLY different, but as far as I'm aware still no standard state correction is needed on either end. It's a little harder to convince myself as to why this is, but as a starting point let's think of this process:

I turn off the solute's charges and LJ parameters in solution at a specific box size
I turn off the solute's charges and LJ parameters in (non-periodic) gas phase at the same box size In this case you could apply a standard state correction to both systems and it would exactly cancel, since you've done it at the same box size (concentration)

Now imagine we modify the procedure and turn off the solute charges and LJ parameters in non-periodic gas phase at a different box size. Since it is non-periodic we get the same free energy, which is just the free energy of turning off the solute internal interactions (since the box size does not affect this in any way). If we then apply a standard state correction based on the box size we used, it would now NO LONGER cancel with the correction used for the solute in solvent, and this would be incorrect, since our total (1M to 1M) solvation free energy ought not to depend on the box size used for the vacuum calculation.

What am I missing? It seems like I've convinced myself that you shouldn't be applying any standard state corrections if you're doing annihilation, otherwise you'll run into problems when accounting for the concentration used in the gas phase simulation.

Ah, wait, I think I see the issue - in the protocol I just described, I'm missing a leg in the thermodynamic cycle. Specifically, your cycle should involve:

Compute the free energy to annihilate the solute in solution at some concentration, with the end state being the non-interacting solute in some box size plus (separate) water in the same box size
Compute the free energy to transfer the non-interacting solute from that box size to whatever box size is used for the gas phase calculation
Compute the free energy to turn back on the solute interactions in gas phase in that size (non-periodic) box

You have the choice to either: a) include standard state corrections for steps 1 and 3 to get you to 1M, AND to include the contribution of step 2, or b) realize you are already getting concentration-independent results (1M to 1M, effectively) since you are keeping the concentration fixed, and ignore the standard state corrections for steps 1 and 3 and the volume change contribution from step 2 (since these all end up canceling anyway).

So, there you go. That's the issues as I understand them at present.

luirink commented 9 years ago

These would cancel so you would end up with no standard state correction. But this also should generalize to a case where someone managed to cook up a protocol which DID change the concentration of solute while changing phases (though I can't at the moment think of why it would make sense to do so in solvation calculations).

I believe this is generally accepted for solvation free energies in fact in computational chemistry. I think it is only an issue when water is both solvent and solvated species or when there is interaction between the solute periodic images. However, the experimental values for solvation free energies are often at 1atm, not 1M. The free-energy change of 1 mol of an ideal gas from 1 atm (24.46 L/mol, standard state 0) to 1 M (1 mol/L, standard state ): RT ln(V0 ⁄ V) )= RT ln(24.46) )=1.89 kcal ⁄ mol at T=298.15 K see Kelly and Cramer in 2005/2006 for this. However this is not done, most of the times. I am not sure if it should be done.

And in the case of hydration free energies, we have A vacuum phase that is not conducted in a periodic box. This should presumably have a standard state correction of zero.

I agree that the standard state correction should be zero in this case. They should cancel, as showed in the thermodynamic cycle David Mobley mentioned.

For the simplification, I agree to let every phase carry a correction (even when this is zero). This seems also a more transparent way.

jchodera commented 8 years ago

We also have to compute the free energy of imposing the restraints.

Lnaden commented 8 years ago

We will need to address this more broadly, but probably after the refactor of the writer API.

We probably wont get this resolved by 1.0, but it should be addressed shortly after.

1.0 touch.

Lnaden commented 7 years ago

Several of the goals here have been improved by #588, but there may be outstanding issues.

choderalab / yank

Simplifying standard state correction application #305