aloctavodia / BAP

Bayesian Analysis with Python (Second Edition)
https://www.amazon.com/dp/B07HHBCR9G
MIT License

Probability of superiority in Chapter2 #78

Closed alexander-pv closed 3 years ago

alexander-pv commented 3 years ago

Hi! Thank you for the book! In Chapter 2, I reached opposite conclusions from the HPD and from the probability of superiority. I believe this happens because Cohen's d can be negative, and a negative d changes the value of the cumulative normal distribution used to compute the probability of superiority.

For example, based on the HPD I can say that the average tip size differs between Thursday and Sunday. But according to the probability of superiority, there is only a 39% chance that a visit picked at random from the Sunday group will have a higher tip than a visit picked at random from the Thursday group. However, when I take the absolute value of Cohen's d before passing it to the cumulative normal distribution, the conclusions based on the probability of superiority are consistent with those based on the HPD.

So my suggestion: ps = dist.cdf(np.abs(d_cohen)/(2**0.5))

I have also attached screenshots from Jupyter:

Before fix: img1

After fix: img2

aloctavodia commented 3 years ago

Hi @alexander-pv thanks for your comment.

The comparison reported in the book is Thursday - Sunday. The negative value of Cohen's d means that tips are larger on Sundays, and the probability of superiority of 39% is the chance that a visit picked at random from the Thursday group will have a higher tip than a visit picked at random from the Sunday group. If you reverse the comparison to Sunday - Thursday (which is equivalent to taking the absolute value, as the difference is now positive), you will get 61%. Both ways are equivalent as long as you keep track of which group (day) is used as the reference.
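
To make the symmetry concrete, here is a minimal sketch (not the book's code). It assumes the probability of superiority is computed as the standard normal CDF evaluated at d / sqrt(2), as in the expression quoted above, and uses the standard library's `statistics.NormalDist` in place of the `dist` object. The helper name `probability_of_superiority` and the value of d are hypothetical, chosen only to land near the 39% discussed here.

```python
from statistics import NormalDist


def probability_of_superiority(d_cohen):
    """Chance that a random draw from group A exceeds a random draw from
    group B, where d_cohen is Cohen's d computed as A - B.
    Assumes both groups are normal with a common standard deviation."""
    return NormalDist().cdf(d_cohen / 2 ** 0.5)


# Illustrative value only (not taken from the book): Cohen's d for Thursday - Sunday.
d_thu_minus_sun = -0.4

# Reference group Thursday: P(Thursday tip > Sunday tip)
ps_thu = probability_of_superiority(d_thu_minus_sun)

# Reversing the comparison flips the sign of d: P(Sunday tip > Thursday tip)
ps_sun = probability_of_superiority(-d_thu_minus_sun)

print(f"P(Thursday > Sunday) = {ps_thu:.2f}")
print(f"P(Sunday > Thursday) = {ps_sun:.2f}")
print(f"Sum = {ps_thu + ps_sun:.2f}")
```

The two probabilities always add up to 1, so flipping the sign of d (or taking its absolute value) only changes which day is treated as the reference. It does not change the conclusion.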

alexander-pv commented 3 years ago

Now I see it. Thank you for the explanation!

aloctavodia commented 3 years ago

Glad to help! I am closing this now; feel free to reopen this issue or open new issues in the future.