Closed Edderic closed 6 years ago
Hmmm. I think I understand better now why "overestimated" was used, but I think we could improve the wording here with something like: 'When using the lower bound of the 95% credible interval, we believe with high certainty that the "true upvote ratio" is at least this value.' Thoughts?
Rereading my original text, it does feel confusing. So yes, I like your second wording.
Awesome! @CamDavidsonPilon I removed the line about overestimating/underestimating and replaced it with the following:
Why is sorting based on this quantity a good idea? By ordering by the 95% least plausible value, we are being the most conservative with what we think is best. When using the lower-bound of the 95% credible interval, we believe with high certainty that the "true upvote ratio" is at the very least equal to this value (or greater), thereby ensuring that the best submissions would be on top. Under this ordering, we impose the following very natural properties:...
Context
In the part about sorting Reddit submissions by the lower bound of the credible interval, the book says the following:
Why is sorting based on this quantity a good idea? By ordering by the 95% least plausible value, we are being the most conservative with what we think is best. That is, even in the worst case scenario, when we have severely overestimated the upvote ratio, we can be sure the best submissions are still on top. Under this ordering, we impose the following very natural properties
Since we are taking the lower bound of the credible interval, we are by definition more likely to be underestimating the "true upvote ratio", so the text should say "underestimate" instead of "overestimate".
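To make the point concrete, here is a minimal sketch of ranking by a conservative lower bound. It rests on assumptions that are mine, not confirmed by the thread: a uniform Beta(1, 1) prior on the upvote ratio, a normal approximation to the resulting Beta posterior, and z = 1.65 as a rough one-sided 95% cutoff. The function name `posterior_lower_bound` and the sample vote counts are hypothetical.

```python
import math


def posterior_lower_bound(upvotes, downvotes, z=1.65):
    """Approximate lower bound on the "true upvote ratio".

    Assumption: uniform Beta(1, 1) prior, so the posterior is
    Beta(1 + upvotes, 1 + downvotes), approximated here by a normal.
    Assumption: z = 1.65 is a rough one-sided 95% cutoff.
    """
    a = 1.0 + upvotes
    b = 1.0 + downvotes
    mean = a / (a + b)
    # Standard deviation of a Beta(a, b) distribution.
    std = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1.0)))
    return mean - z * std


# Hypothetical submissions: (label, upvotes, downvotes).
submissions = [("A", 1, 0), ("B", 60, 40), ("C", 600, 400), ("D", 9, 1)]

# Sort by the conservative lower bound, best first.
ranked = sorted(
    submissions,
    key=lambda s: posterior_lower_bound(s[1], s[2]),
    reverse=True,
)

for label, up, down in ranked:
    print(label, up, down, round(posterior_lower_bound(up, down), 3))
```

The bound sits below the posterior mean, which is why it tends to understate the ratio rather than overstate it. With these made-up counts, submission A (1 upvote, 0 downvotes) has the highest raw ratio but ranks last, because so little data leaves the bound far below the mean. B and C share the same raw ratio, but C ranks higher because more votes tighten the interval.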