CamDavidsonPilon / Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)
http://camdavidsonpilon.github.io/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/
MIT License
26.55k stars, 7.85k forks

change 'overestimate' to 'underestimate' #372

Closed Edderic closed 6 years ago

Edderic commented 6 years ago

Context

In the part about sorting Reddit submissions by the lower bound of the credible interval, the book says the following:

Why is sorting based on this quantity a good idea? By ordering by the 95% least plausible value, we are being the most conservative with what we think is best. That is, even in the worst case scenario, when we have severely overestimated the upvote ratio, we can be sure the best submissions are still on top. Under this ordering, we impose the following very natural properties

Since we are taking the lower bound of the credible interval, we are, by construction, more likely to be underestimating the "true upvote ratio", so the text should say 'underestimated' instead of 'overestimated'.

Edderic commented 6 years ago

Hmmm. I think I understand better now why "overestimated" was used, yet I think we could improve the wording here with something like: "When using the lower bound of the 95% credible interval, we believe with high certainty that the 'true upvote ratio' is at least this value." Thoughts?

CamDavidsonPilon commented 6 years ago

Rereading my original text, it does feel confusing. So yes, I like your second wording.

Edderic commented 6 years ago

Awesome! @CamDavidsonPilon I removed the line about overestimating/underestimating and replaced it with the following:

Why is sorting based on this quantity a good idea? By ordering by the 95% least plausible value, we are being the most conservative with what we think is best. When using the lower-bound of the 95% credible interval, we believe with high certainty that the "true upvote ratio" is at the very least equal to this value (or greater), thereby ensuring that the best submissions would be on top. Under this ordering, we impose the following very natural properties:...
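
For readers who want to see the ordering concretely, here is a minimal sketch of ranking by the lower bound of the posterior. It is an illustration under stated assumptions, not necessarily the book's exact code: it assumes a uniform Beta(1, 1) prior on the upvote ratio, approximates the Beta posterior with a normal distribution, and uses z = 1.645 so that roughly 95% of the posterior mass lies above the bound. The function name `lower_bound` and the vote counts are hypothetical.

```python
import math


def lower_bound(upvotes, downvotes, z=1.645):
    """Approximate lower bound on the 'true upvote ratio'.

    Assumptions (not from the thread): a uniform Beta(1, 1) prior, so the
    posterior is Beta(1 + upvotes, 1 + downvotes), and a normal
    approximation to that posterior. With z = 1.645, about 95% of the
    approximating distribution lies above the returned value.
    """
    a = 1 + upvotes
    b = 1 + downvotes
    n = a + b
    mean = a / n
    # Standard deviation of a Beta(a, b) distribution.
    std = math.sqrt(a * b / (n ** 2 * (n + 1)))
    return mean - z * std


# Hypothetical (upvotes, downvotes) counts, for illustration only.
submissions = {"A": (1, 0), "B": (50, 10), "C": (500, 200)}

# Sort so the submission with the highest lower bound comes first.
ranked = sorted(
    submissions,
    key=lambda name: lower_bound(*submissions[name]),
    reverse=True,
)

for name in ranked:
    print(name, round(lower_bound(*submissions[name]), 3))
```

Submission A has a perfect raw ratio (1 upvote, 0 downvotes) but ends up last, because one vote leaves the posterior wide and the lower bound low. This matches the reworded sentence: the bound is a value we believe the true ratio exceeds with high certainty, so submissions with little evidence are ranked conservatively.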