Open sonyahanson opened 8 years ago
Can you zoom in on the DeltaG trace vs time in In[19]
of that notebook? It looks to me like there's just long correlation time fluctuations in the trace that lead these unconverged plots.
You could even compute the correlation time with pymbar.timeseries.statisticalInefficiency()
if you want to quantify this.
Is that what you're referring to?
That certainly looks like the samples for delG are very correlated. If you calculate the statistical inefficiency as John suggests, perhaps you could thin the data by an amount t
, where
g = 1 + 2t,
with g
is the statistical inefficiency and t
is the autocorrelation time. That could help with removing the multi-modality.
Thanks! Will look at that more next week. For now just tested what increasing the iterations would do, and it does seem to help.
It seems like the number of iterations had been reduced for debugging in run_mcmc
in pymcmodels.py
(https://github.com/choderalab/assaytools/blob/master/AssayTools/pymcmodels.py#L455), so despite the fact that it seemed like the number of samples was high from top of the file:
#=============================================================================================
# Parameters for MCMC sampling
#=============================================================================================
DG_min = np.log(1e-15) # kT, most favorable (negative) binding free energy possible; 1 fM
DG_max = +0 # kT, least favorable binding free energy possible
niter = 500000 # number of iterations
nburn = 50000 # number of burn-in iterations to discard
nthin = 500 # thinning interval
The actual numbers used for the figures above were considerably smaller (nthin = 20, nburn = 20 000, niter = 40 000
). Below I've commented out the # DEBUG
section and rerun the same analysis so that now nthin = 20, nburn = 200 000, niter = 400 000
. So that's ten times more sampling than above, though half of it is being dropped as 'burn in'. I ran it twice just to see the rough consistency of the results.
Run 1:
Run 2:
Presumably the delG trace is for SrcErl? It seems to have an odd behaviour with jumping down to much lower binding free energies.
Yes, it's for SrcErl.
We definitely see this jumping behavior in the delG trace most prominently for the data with prominent outliers, so maybe we can kill two birds with one stone (#52):
See below images from the second half of this notebook: https://github.com/choderalab/assaytools/blob/master/data/spectra/Spectra-Analysis.ipynb
Hoping this will clear up with increased sampling...