Autocorrelation in inference example

pints-team / pints

Probabilistic Inference on Noisy Time Series

http://pints.readthedocs.io


Autocorrelation in inference example #133

Closed MichaelClerx closed 6 years ago

MichaelClerx commented 6 years ago

https://github.com/pints-team/pints/blob/c6f71cc25b6701082b28c49b0868dfa8842d17b7/examples/inference-first-example.ipynb

Doesn't look great to me... @ben18785 @sanmitraghosh any ideas?

chonlei commented 6 years ago

Probably fine?

See e.g. https://stats.stackexchange.com/questions/119879/how-to-interpret-autocorrelation-plot-in-mcmc

But of course it would be good to have @ben18785's and @sanmitraghosh's comments.

chonlei commented 6 years ago

And if we show more...

MichaelClerx commented 6 years ago

https://github.com/matplotlib/matplotlib/issues/9944#event-1376136686

There was an issue with the legends too but they fixed it :-)

mirams commented 6 years ago

That's a much better plot, Chon! Don't autocorrelation plots normally include negative lags too, or is there some symmetry that means we only need to look at positive lags?

MichaelClerx commented 6 years ago

Ok, Sanmitra thinks it's fine too. @chonlei Could you do a few very very very last things to plot.py and the examples?

1. Set the default lag parameter to 100 in plot.py
2. In trace(), set the histogram plotting mode back to ordinary histograms again (the current thing is confusing, I hear)
3. Fix the issue with the imports
4. Update https://github.com/pints-team/pints/blob/master/examples/inference-adaptive-covariance-mcmc.ipynb and remove the plot code, replacing it with 1 (only 1 I think!) of the diagnostic plots (maybe the scatter plot with kde?)
5. Close this ticket

chonlei commented 6 years ago

@mirams Great -- will change it to have a longer lag. Yes, I think the autocorrelation of a real-valued series is by definition symmetric about 0 lag, so the positive lags are enough. https://en.wikipedia.org/wiki/Autocorrelation#Properties
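The symmetry can be checked numerically. The sketch below is only an illustration and assumes NumPy is available; the AR(1) series stands in for an MCMC chain, and the name `full_autocorrelation` is made up for this example rather than taken from PINTS.

```python
import numpy as np


def full_autocorrelation(x):
    """Return (lags, acf) for lags -(n-1)..(n-1), scaled so acf is 1 at lag 0."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # mode="full" returns one value per lag, from -(n-1) up to n-1
    acf = np.correlate(x, x, mode="full")
    acf = acf / acf[n - 1]  # index n-1 corresponds to lag 0
    lags = np.arange(-(n - 1), n)
    return lags, acf


# Hypothetical stand-in for an MCMC chain: an AR(1) series
rng = np.random.default_rng(1)
chain = np.zeros(500)
for i in range(1, len(chain)):
    chain[i] = 0.9 * chain[i - 1] + rng.normal()

lags, acf = full_autocorrelation(chain)

# For a real-valued series, the value at lag -k matches the value at lag +k
is_symmetric = np.allclose(acf, acf[::-1])
print(is_symmetric)
```

Because the two halves mirror each other, plotting only lags 0 and above loses no information.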

@MichaelClerx Sure, fixed the first three points. But for the 4th point, I thought that would be a good place to show people how to use pints.plot as a diagnostic tool all in one -- no?

(Will close this ticket once that is done.)

MichaelClerx commented 6 years ago

@chonlei We show them all in the "first example". The example I linked to in point 4 is the one that shows how to do adaptive covariance MCMC. This should just be a quick script showing how you use the method!

chonlei commented 6 years ago

Yup, I read the wrong one. Corrected all in master branch.

sanmitraghosh commented 6 years ago

@chonlei Great plots. Kudos.

@MichaelClerx Sadly, people have used lag values anywhere from 50 to 1000, so to be honest there is no standard here across papers and tutorials. However, I have seen a thesis or two where a lot of samplers are compared on ODE problems, and they tend to use 50, 100, or 200. So I think 100 would be a fair default value.

mirams commented 6 years ago

The number of lag units will depend on the sampling frequency and the timescales in the problem. Can we do something semi-smart and plot it as high as it needs to go so that we have all 'big' entries shown (i.e. everything over 1% of peak or something)?
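One way to read the "everything over 1% of peak" rule is sketched below. This is an assumption about what such a rule could look like, not code from PINTS; `choose_max_lag`, its parameters, and the geometric example data are all hypothetical.

```python
def choose_max_lag(acf, fraction=0.01, minimum=1):
    """Return the largest lag whose |autocorrelation| exceeds fraction * peak.

    acf[k] is the autocorrelation at lag k, for k = 0, 1, 2, ...
    """
    peak = max(abs(v) for v in acf)
    if peak == 0:
        # Nothing to show: fall back to the smallest allowed lag
        return minimum
    big = [k for k, v in enumerate(acf) if abs(v) > fraction * peak]
    return max(minimum, big[-1])


# Made-up autocorrelation that decays geometrically as 0.8**k
example_acf = [0.8 ** k for k in range(200)]
max_lag = choose_max_lag(example_acf)
print(max_lag)  # prints 20, since 0.8**20 > 0.01 > 0.8**21
```

With a noisy estimate from a real chain, stray values above the threshold at large lags would push the result up, so a practical version might instead stop at the first lag that drops below the threshold.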

MichaelClerx commented 6 years ago

Right now we're using acorr() to plot, which makes that impossible. But we could go back to using Ben's FFT-based code (only in plot.py, not in the long example where we show how to do plots), so that we can inspect the results before plotting them. That would give us a method autocorrelation(lag=None) that sets lag to a semi-smart value if lag is None, and otherwise to the explicit integer value given by the user?
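A minimal sketch of what such a function might look like follows, assuming NumPy. It is not Ben's code and not the PINTS API: the `samples` argument, the `fraction` parameter, and the 1%-of-peak rule for `lag=None` are assumptions made for illustration.

```python
import numpy as np


def autocorrelation(samples, lag=None, fraction=0.01):
    """Return the normalised autocorrelation of a 1-D chain for lags 0..lag.

    If lag is None, use the largest lag whose |value| exceeds `fraction`
    of the lag-0 peak (an assumed rule, not PINTS behaviour).
    """
    x = np.asarray(samples, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Zero-pad to at least 2n so the circular correlation computed via
    # the FFT equals the linear correlation for lags 0..n-1
    size = 2 ** int(np.ceil(np.log2(2 * n)))
    f = np.fft.rfft(x, n=size)
    acf = np.fft.irfft(f * np.conj(f), n=size)[:n]
    if acf[0] == 0:
        raise ValueError("Constant chain: autocorrelation is undefined.")
    acf = acf / acf[0]
    if lag is None:
        # Inspect the result before plotting: keep every 'big' entry
        lag = max(1, int(np.flatnonzero(np.abs(acf) > fraction)[-1]))
    else:
        lag = int(lag)
    lag = min(lag, n - 1)
    return acf[: lag + 1]


# Hypothetical stand-in for an MCMC chain: an AR(1) series
rng = np.random.default_rng(1)
chain = np.zeros(2000)
for i in range(1, len(chain)):
    chain[i] = 0.9 * chain[i - 1] + rng.normal()

explicit = autocorrelation(chain, lag=100)  # lags 0..100
automatic = autocorrelation(chain)          # length chosen from the data
```

Returning the values rather than plotting them directly is what allows the lag to be chosen after the fact; a plotting helper could then draw `automatic` as a bar or stem plot.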