CamDavidsonPilon / Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/
MIT License

Chapter 2: description regarding the separation plot for Fig. 2.3.2 #542

Open beinstein23 opened 2 years ago

beinstein23 commented 2 years ago

"The black vertical line is the expected number of defects we should observe, given this model. This allows the user to see how the total number of events predicted by the model compares to the actual number of events in the data."

The quote above comes from the paragraph under the first separation plot in https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter2_MorePyMC/Ch2_MorePyMC_PyMC2.ipynb

However, I think this description may be misleading. The expected number of defects should be computed with the approach explained in Appendix 2.5, i.e. `posterior_probability.sum()`. In my case this is about 6.99753, which matches the 7 defects actually observed. What `separation_plot.py` computes, however, is $N - \sum_i p_i$, which in my case is about 16.0047. In my opinion this does make sense as a way to show how far the blue bars should extend from the right-hand side. As you explain in the text, ideally all the blue bars should be close to the right-hand side.

But then the description quoted at the beginning of this issue is no longer exact. Maybe we could say instead: "The black vertical line is the expected number of defects (counting from the right-hand side)."
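To make the difference between the two quantities concrete, here is a minimal sketch in plain Python. The probabilities are made-up illustrative values (not the Challenger posterior), and the names `vline_x` and `distance_from_right` are my own, assuming the line is drawn at $N - \sum_i p_i$ as described above.

```python
import math

# Hypothetical posterior defect probabilities, one per observation
# (illustrative values only, not the results from the notebook).
posterior_probability = [0.05, 0.10, 0.20, 0.40, 0.75, 0.90]
n = len(posterior_probability)

# Expected number of defects under the model: sum_i p_i (Appendix 2.5 approach).
expected_defects = sum(posterior_probability)

# Position of the vertical line as described in this issue: N - sum_i p_i,
# which is the same as sum_i (1 - p_i), measured from the left edge.
vline_x = sum(1.0 - p for p in posterior_probability)

# Distance of that line from the right-hand edge of the plot.
distance_from_right = n - vline_x

print("expected defects:      ", expected_defects)
print("line position (left):  ", vline_x)
print("line distance (right): ", distance_from_right)

# The line sits at the expected number of defects only when counted from the right.
assert math.isclose(distance_from_right, expected_defects)
```

So the line's x-position is the expected number of non-defects, and the expected number of defects only appears when measuring from the right-hand edge, which is why I suggest adjusting the wording.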

Best wishes,

Beinstein.