CamDavidsonPilon / Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/
MIT License

Ch.3 Bimodal Example Will Not Converge #320

Open ghost opened 8 years ago

ghost commented 8 years ago

I'm working through this book, which has mostly been fantastic, but I am running into a convergence issue with the bimodal example given in section 3.1. No matter what I do or how many times I run it (even a direct copy and paste from the online page), the value of p always converges to zero regardless of its inherent value, and one of the standard deviation values is wildly erratic, never converging to anything.

Has anybody else run into this issue? Again, I am directly running the code from the book as-is in a current Jupyter notebook:

[Screenshot: 2016-10-05, 3:30:35 PM]

CamDavidsonPilon commented 8 years ago

Can you post what version of PyMC you are using? I remember there are some regressions in PyMC2 around this.

ghost commented 8 years ago

Hi Cam, I appear to have PyMC version 2.3.4. I've had enormous difficulty installing PyMC on OS X, so I haven't updated in a while.

Lugrin commented 7 years ago

Assuming for a minute that there's no bug in PyMC:

I'd say it converges towards a uni-modal distribution. Cluster 0 has a probability close to 0.0, so it is effectively unused and can have any distribution; we can't expect its parameters to converge. Cluster 1 has a probability close to 1.0, and this one cluster is enough to describe the observed data: the distribution of the observed data is the same as the distribution of cluster 1, Normal(170, 45).
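
To make the point about cluster 0 concrete, here is a minimal sketch in plain Python (not the book's PyMC model). The function names and the five data points are made up for illustration, and the second argument of each normal is assumed to be a standard deviation. It shows that once the weight of cluster 0 is 0, the mixture likelihood no longer depends on cluster 0's parameters at all:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma) at x, with sigma a standard deviation."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_loglik(data, p, mu0, sigma0, mu1, sigma1):
    """Log-likelihood of a two-component mixture; p is the weight of cluster 0."""
    total = 0.0
    for x in data:
        total += math.log(p * normal_pdf(x, mu0, sigma0)
                          + (1.0 - p) * normal_pdf(x, mu1, sigma1))
    return total

# Made-up observations, for illustration only (not the book's dataset).
data = [110.0, 150.0, 170.0, 190.0, 230.0]

# With p = 0, cluster 0 drops out of the sum: very different cluster-0
# parameters give exactly the same log-likelihood.
a = mixture_loglik(data, 0.0, 120.0, 10.0, 170.0, 45.0)
b = mixture_loglik(data, 0.0, 300.0, 80.0, 170.0, 45.0)
print(a == b)  # True
```

When the data carry no information about a parameter, only its prior constrains it, which would be consistent with one standard deviation wandering erratically in the trace.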

Can you check that you're using the correct observation data?

  1. Check that the input file (Chapter3_MCMC/data/mixture_data.csv) contains the same data as the online version.
  2. Start a fresh jupyter/notebook kernel and run all cells. Do you get the same "Histogram of the dataset" as the online version? Do you still get a uni-modal posterior distribution?

Let me know if you're not sure how to check the points above.
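
For the first check, a rough sketch along these lines might help. It assumes the CSV holds one number per line, which may not match the actual file layout, and the helper names `load_values` and `bin_counts` are hypothetical. Comparing the printed counts against the histogram in the online notebook should reveal whether the local data has two peaks:

```python
import csv

def load_values(path):
    """Read the first column of a CSV file as floats, skipping blank lines."""
    with open(path, newline="") as f:
        return [float(row[0]) for row in csv.reader(f) if row]

def bin_counts(values, bins=10):
    """Count how many values fall into each of `bins` equal-width bins."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins or 1.0  # avoid a zero step if all values are equal
    counts = [0] * bins
    for v in values:
        i = min(int((v - lo) / step), bins - 1)  # the maximum goes in the last bin
        counts[i] += 1
    return counts

# Example usage (path relative to the repository root):
#   values = load_values("Chapter3_MCMC/data/mixture_data.csv")
#   print(len(values), min(values), max(values))
#   print(bin_counts(values, bins=20))
```

Bimodal data should show two separate runs of high counts with a dip between them; a single hump would suggest the local file differs from the online one.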

Lugrin commented 7 years ago

Duplicate issue: #255 and #245. It's a bug in PyMC2. It's fixed in PyMC 2.3.5.
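
To tell whether an installation already includes the fix, a small comparison like the one below could be used. It is only a sketch: it assumes plain dotted numeric version strings such as `2.3.4`, and the helper names are hypothetical.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.3.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))

def has_fix(version_text, fixed_in="2.3.5"):
    """Return True if version_text is at least the release that fixed the bug."""
    return parse_version(version_text) >= parse_version(fixed_in)

print(has_fix("2.3.4"))  # False
print(has_fix("2.3.5"))  # True

# To check a real installation (assuming the package exposes __version__):
#   import pymc
#   print(pymc.__version__, has_fix(pymc.__version__))
```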

@jasonnett Happy to close this issue?