alchemistry / alchemlyb

the simple alchemistry library
https://alchemlyb.readthedocs.io
BSD 3-Clause "New" or "Revised" License
185 stars 49 forks source link

Handling with nan values in amber output files #330

Closed yanze039 closed 11 months ago

yanze039 commented 1 year ago

Hi,

I am processing output files from Amber22 TI simulations with ACES functionality (interaction scaling + HREMD). My trajectory at $\lambda=0$ has some frames which have very high energy (even Inf) evaluated at $\lambda=1$

An output example is like below, a simulation under $\lambda=0.0$

MBAR Energy analysis:
Energy at 0.0000 =    -30154.583982
Energy at 0.1768 =    -30152.333791
Energy at 0.2298 =    -30149.869286
Energy at 0.2694 =    -30147.069520
Energy at 0.3027 =    -30143.751525
Energy at 0.3323 =    -30139.633360
Energy at 0.3594 =    -30134.272510
Energy at 0.3849 =    -30126.957766
Energy at 0.4091 =    -30116.517983
Energy at 0.4325 =    -30100.966559
Energy at 0.4553 =    -30076.820870
Energy at 0.4777 =    -30037.740995
Energy at 0.5000 =    -29971.655966
Energy at 0.5223 =    -29854.377971
Energy at 0.5447 =    -29634.387024
Energy at 0.5675 =    -29193.640682
Energy at 0.5909 =    -28236.880738
Energy at 0.6151 =    -25940.650416
Energy at 0.6406 =    -19669.827999
Energy at 0.6677 =       673.131755
Energy at 0.6973 =     84706.107523
Energy at 0.7306 =    585347.101881
Energy at 0.7702 =   6199425.963417
Energy at 0.8232 = 265481151.052853
Energy at 1.0000 = ****************

In this case, decorrelate_u_nk doen't work properly, since it can't read a nan.

/home/gridsan/ywang3/.local/lib/python3.9/site-packages/numpy/core/_methods.py:236: RuntimeWarning: invalid value encountered in subtract
  x = asanyarray(arr - arrmean)
/home/gridsan/ywang3/.conda/envs/amber22/lib/python3.9/site-packages/pymbar/timeseries.py:161: RuntimeWarning: invalid value encountered in subtract
  dA_n = A_n.astype(np.float64) - mu_A
/home/gridsan/ywang3/.conda/envs/amber22/lib/python3.9/site-packages/pymbar/timeseries.py:162: RuntimeWarning: invalid value encountered in subtract
  dB_n = B_n.astype(np.float64) - mu_B
Traceback (most recent call last):
  File "/home/gridsan/ywang3/Project/Capping/AmberTI/eIF4E/analysis.py", line 28, in <module>
    analyze(
  File "/home/gridsan/ywang3/Project/amberti/amberti/analysis.py", line 48, in analyze
    decorrelated_u_nk = decorrelate_u_nk(data2, method='all', remove_burnin=True)
  File "/home/gridsan/ywang3/.conda/envs/amber22/lib/python3.9/site-packages/alchemlyb/preprocessing/subsampling.py", line 69, in decorrelate_u_nk
    return equilibrium_detection(df, series, **kwargs)
  File "/home/gridsan/ywang3/.conda/envs/amber22/lib/python3.9/site-packages/alchemlyb/preprocessing/subsampling.py", line 600, in equilibrium_detection
    indices = _subsample_correlated_data(series_equil, g=statinef)
  File "/home/gridsan/ywang3/.conda/envs/amber22/lib/python3.9/site-packages/pymbar/timeseries.py", line 750, in subsample_correlated_data
    while int(round(n * g)) < T:
ValueError: cannot convert float NaN to integer

I have already used softcore potentials to wrap the changing atoms up here, so I have no idea why amber still outputs ****** here.

I guess we can allow decorrelate_u_nk to read this case and fill nan with a very large value.

xiki-tempula commented 11 months ago

Thanks for reporting this. For u_nk, it is usually recommended to use dE for decorrelation. I wonder why do you want to use all?

xiki-tempula commented 11 months ago

I think in some sense, this is a more statistics problem than a coding problem. How do one compute the self-correlation for a timeseries including NaN.

yanze039 commented 11 months ago

Ohh, yeah, I didn't notice that I used method='all'. After changing to dE, it looks good now. Can I generally ask why dE is better? Thanks!

xiki-tempula commented 11 months ago

Let's say for this specific example, the lambda=1 is infinite, so if you sum all the lambdas up, you would get an infinite number. So for this timeseries, you would just get a long timeseries of infinite number.

xiki-tempula commented 11 months ago

@Dead-fisher Are you happy with the solution that I provided so I could close this ticket?

yanze039 commented 11 months ago

Yeah, absolutely. Thanks!