CamDavidsonPilon / Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/
MIT License

Autocorrelation #224

Open · RoyalTS opened this issue 10 years ago

RoyalTS commented 10 years ago

In Chapter 3 you write:

"A chain that is [Isn't meandering exploring?] exploring the space well will exhibit very high autocorrelation. Visually, if the trace seems to meander like a river, and not settle down, the chain will have high autocorrelation. "This does not imply that a converged MCMC has low autocorrelation. Hence low autocorrelation is not necessary for convergence, but it is sufficient. PyMC has a built-in autocorrelation plotting function in the Matplot module. "

I assume the parenthetical comment in the first line is due to someone who is as confused as I am. Here's what I don't understand:

First, shouldn't a chain that meanders and doesn't settle down have low autocorrelation? After all, its current value tells me much less about the value next period than it does if the chain only bounces around in some small part of the parameter space. But from your explanation that reasoning is clearly wrong somehow. Second, "exploring the space well" is a little confusing because "well" implies something good. I suspect what this is supposed to say is "explore lots of the space before converging on the part of the space where the bulk of the posterior mass is concentrated", right?
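To make the question concrete, here is a small NumPy sketch (mine, not from the book or the thread) comparing the lag-1 autocorrelation of a meandering, random-walk-style chain with that of a well-mixed, nearly independent one; the quoted passage also points at PyMC's Matplot module if you want a ready-made autocorrelation plot instead.

```python
import numpy as np

def autocorr(x, lag=1):
    """Sample autocorrelation of a 1-D trace at the given lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

rng = np.random.RandomState(0)

# A "meandering" chain: each value is a small step away from the previous
# one (a random walk), so neighbouring samples are highly correlated.
meandering = np.cumsum(rng.normal(scale=0.1, size=5000))

# A "well-mixed" chain: (nearly) independent draws, so knowing sample n
# tells you almost nothing about sample n + 1.
well_mixed = rng.normal(size=5000)

print(autocorr(meandering))  # close to 1
print(autocorr(well_mixed))  # close to 0
```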

CamDavidsonPilon commented 10 years ago

If I changed

A chain that is

to

A chain that isn't

does everything make sense? I think that's what I meant to write.

RoyalTS commented 10 years ago

That makes a lot more sense.

I'm still a bit confused though. In the hopes that I'm not the only one who has trouble wrapping his mind around this, here goes: if the chain meanders around, shouldn't that mean it is less autocorrelated than if it stays in some small region of the parameter space? If you tell me what the nth element is, aren't I able to guess the (n+1)st element much more accurately in the latter case than in the former? What am I missing?

BTW: This is probably not the ideal place to say this, but I love everything about this project of yours. I'm an econ Ph.D. student with lots of stats (some of it Bayesian even) under my belt and never understood what all the fuss was about. It's dawning on me how much I've been missing. So, thanks for the terrific book and being so open about its further development!

CamDavidsonPilon commented 10 years ago

I'm so happy to hear positive things like this. Thanks for sharing!

For clarity, here's my definition of a meandering river:

[image: a meandering river]

Imagine you are a water molecule in that river. If I know where you are on the water, I can guess where you are going to be next, with decent probability. On the other hand, we say a chain is "mixing well" if it has low autocorrelation. "Mixing well" is more than a saying: ideally you want your chain to behave like this:

[image: turbulent, churning water ("hogsback")]

In this case, I have little idea where you might end up next. This "mixes well" and has low autocorrelation. I think the idea of a "small region" is not being interpreted well - did I write that? What it should be is: the chain covers (i.e. explores) the distribution quite randomly and with low autocorrelation.
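To tie the analogy back to numbers, here's a toy random-walk Metropolis sampler (my own sketch, not PyMC) on a standard normal target: a chain taking tiny steps meanders like the river and keeps its lag-1 autocorrelation near 1, while a chain with a sensibly sized proposal mixes well and has much lower autocorrelation.

```python
import numpy as np

def metropolis(logp, n_samples, proposal_scale, x0=0.0, seed=0):
    """Toy random-walk Metropolis sampler for a 1-D log-density."""
    rng = np.random.RandomState(seed)
    trace, x = np.empty(n_samples), x0
    for i in range(n_samples):
        proposal = x + rng.normal(scale=proposal_scale)
        if np.log(rng.rand()) < logp(proposal) - logp(x):
            x = proposal  # accept the move
        trace[i] = x      # otherwise stay put (and record the repeat)
    return trace

log_normal = lambda x: -0.5 * x ** 2  # unnormalised standard normal

meandering = metropolis(log_normal, 5000, proposal_scale=0.05)  # tiny steps
well_mixed = metropolis(log_normal, 5000, proposal_scale=2.5)   # larger, tuned steps

for trace in (meandering, well_mixed):
    d = trace - trace.mean()
    print(np.dot(d[:-1], d[1:]) / np.dot(d, d))  # lag-1 autocorrelation
```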

cebe commented 10 years ago

It was not my question but imo the explanation using the water pictures is amazing, thanks for that! :smile:

RoyalTS commented 10 years ago

I think the "small region" was my interpretation of this:

When I say MCMC intelligently searches, I really am saying MCMC will hopefully converge towards the areas of high posterior probability. MCMC does this by exploring nearby positions and moving into areas with higher probability. [...] Converging usually implies moving towards a point in space, but MCMC moves towards a broader area in the space and randomly walks in that area, picking up samples from that area.

... which is totally fine. I see now where my confusion came from. Thanks for the explanation. Very helpful.
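For anyone else reading along, the behaviour described in that quoted passage is easy to see with the same kind of toy Metropolis sampler as in the earlier sketch (repeated here so the snippet stands alone; again my own code, not the book's): start the chain far from the posterior mass and watch it first drift toward the high-probability region, then randomly walk around inside it.

```python
import numpy as np

def metropolis(logp, n_samples, proposal_scale, x0, seed=0):
    """Toy random-walk Metropolis sampler for a 1-D log-density."""
    rng = np.random.RandomState(seed)
    trace, x = np.empty(n_samples), x0
    for i in range(n_samples):
        proposal = x + rng.normal(scale=proposal_scale)
        if np.log(rng.rand()) < logp(proposal) - logp(x):
            x = proposal
        trace[i] = x
    return trace

# Target is a standard normal, but the chain starts far out at x0 = 20.
trace = metropolis(lambda x: -0.5 * x ** 2, 5000, proposal_scale=1.0, x0=20.0)

print(trace[:100].mean())   # early samples: dominated by the burn-in drift from 20
print(trace[1000:].mean())  # later samples: wandering around the bulk near 0
```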