The discrepancy in calculating u_kln between parallel-tempering-2dpmf.py and umbrella-sampling.py

After reading the examples for calculating the PMF based on data by parallel-templering and umbrella-sampling simulations, i found that there is a discrepancy in calculating u_kln(reduced potential energy).

Following the line 222 to 226 in (https://github.com/mrshirts/pymbar/blob/master/examples/parallel-tempering-2dpmf/parallel-tempering-2dpmf.py), reduced potential energy( u_kln[k,l,n] ) of trajectory segment n of temperature k evaluated at temperature l was calculated by following:.

    print("Computing reduced potential energies...")
    u_kln = numpy.zeros([K,K,N_max]) # u_kln[k,l,n] is reduced potential energy of trajectory segment n of temperature k evaluated at temperature l
    for k in range(K):
       for l in range(K):
          u_kln[k,l,0:N_k[k]] = beta_k[l] * U_kn[k,0:N_k[k]]

which means that u_kln[k,l,n] = beta_k[l] U_kn[k,n] = beta_k[l] U_kn[k,l,n] ( without bias, U_kn[k,n] = U_kn[k,1,n] = U_kn[k,2,n] = ... = U_kn[k,l,n] = .... ). [equation 1]

While in the examples of analyzing the data by umbrella sampling, as shown in the line 141 to 143 of https://github.com/mrshirts/pymbar/blob/master/examples/umbrella-sampling-pmf/umbrella-sampling.py, it is calculated by:

    # Compute energy of snapshot n from simulation k in umbrella potential l
    u_kln[k,:,n] = u_kn[k,n] + beta_k[k] * (K_k/2.0) * dchi**2

where

    u_kn[k,n] = beta_k[k] * (float(tokens[2]) - float(tokens[1])) # reduced potential energy without umbrella restraint(line 89)

which indicates: u_kln[k,l,n] = beta_k[k] U0_kn[k,n] + beta_k[k] (K_k[l] / 2.0) * dchi[l]2 = betak[k_] ( U0_kn[k,n] + (K_k[l] / 2.0) \ dchi[l]2 ) = betak[k_] * U_kn[k,l,n] [ equation 2]

It seems that the [equation 1] is the correct expression, while the [ equation 2 ] is not.

Generally, based on [ euqation 2 ], for umbrella sampling of constanst temperature(beta_k[k] = beta_k[l]), it is ok, while for different temperature, it will lead to incorrect result.

Here are my suggestions: 1)In line 89 of https://github.com/mrshirts/pymbar/blob/master/examples/umbrella-sampling-pmf/umbrella-sampling.py,

  u_kn[k,n] = beta_k[k] * (float(tokens[2]) - float(tokens[1])) # reduced potential energy without umbrella restraint

should be changed to :

  U_kn[k,n] = float(tokens[2]) - float(tokens[1])  # potential energy without umbrella restraint

2)In line 143,

  u_kln[k,:,n] = u_kn[k,n] + beta_k[k] * (K_k/2.0) * dchi**2

should be changed to:

  u_kln[k,:,n] = beta_k * ( U_kn[k,n] + (K_k/2.0) * dchi**2 )

choderalab / pymbar

The discrepancy in calculating u_kln between parallel-tempering-2dpmf.py and umbrella-sampling.py #479