CamDavidsonPilon / Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/
MIT License

Chpt3 Explanation of MCMC description misleading #396

Open bjornsturmberg opened 6 years ago

bjornsturmberg commented 6 years ago

I became quite confused by the "Algorithms to perform MCMC" section.

The issue is with steps 3-4:

> 3. Accept/Reject the new position based on the position's adherence to the data and prior distributions (ask if the pebble likely came from the mountain).
> 4.A If you accept: Move to the new position. Return to Step 1.
> 4.B Else: Do not move to new position. Return to Step 1.

These steps imply that you only accept a new position if it is more likely than the current position. Such an algorithm will proceed towards a local maximum and then get stuck there.

This is not how MCMC algorithms work: they will also move to new positions that are less likely, but they do so with a low probability, typically related to how the likelihood of the new position compares with that of the current position.

May I suggest an update to the paragraph that directly follows the enumerated procedure ("5. After a large number of iterations, return all accepted positions."):

> New positions are generally accepted if they improve the adherence to the data and prior distributions. New positions that do not improve the adherence may also be accepted, but they are accepted with a low probability related to their adherence. This way we move in the general direction towards the regions where the posterior distributions exist, collecting samples sparingly on the journey, while ensuring that we don't get stuck at a local maximum.
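To make the suggested wording concrete, here is a minimal sketch of a Metropolis-style accept/reject step. This is not the book's code: the one-dimensional target density, the uniform proposal, and the step size are all illustrative assumptions; the point is only that a less likely proposal is still accepted with probability equal to the ratio of the (unnormalised) posterior probabilities, which is what lets the chain escape a local maximum.

```python
import math
import random


def target_log_prob(x):
    # Hypothetical unnormalised log-posterior (a standard bell curve),
    # standing in for "adherence to the data and prior distributions".
    return -0.5 * x * x


def metropolis(n_steps, step_size=1.0, seed=0):
    rng = random.Random(seed)
    current = 0.0
    samples = []
    for _ in range(n_steps):
        # Propose a new position near the current one.
        proposal = current + rng.uniform(-step_size, step_size)
        # Steps 3-4: a more likely position is always accepted; a less
        # likely one is accepted with probability exp(log_ratio), i.e.
        # the ratio of the posterior probabilities at the two positions.
        log_ratio = target_log_prob(proposal) - target_log_prob(current)
        if math.log(rng.random()) < log_ratio:
            current = proposal  # 4.A: move to the new position
        # 4.B: otherwise stay put; the current position is recorded again.
        samples.append(current)
    # Step 5: after many iterations, return all positions.
    return samples


samples = metropolis(10_000)
```

Because downhill moves are accepted only occasionally, the chain still spends most of its time where the posterior mass is, so the collected samples approximate the target distribution.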

CamDavidsonPilon commented 6 years ago

Hey @bjornsturmberg,

> imply that you only accept new positions if they are more likely than the current position. Such an algorithm will proceed towards a local maximum, and will then get stuck.

Rereading my algorithm, I don't fully see your point of view. What language do I use that implies what you are suggesting?

bjornsturmberg commented 6 years ago

Hi @CamDavidsonPilon

For me, it was the pebble analogy that led me to think the algorithm only accepts more likely parameters: "(ask if the pebble likely came from the mountain)".

An alternative suggestion would be to simply remove the pebble reference, though I think adding those couple of sentences would still help clarify the procedure.

I asked a few mates (who first got me onto this fantastic resource), and they tended to have had a similar interpretation at this point.