Open meli365 opened 4 years ago
I have debugged the code and this is what I found:
The line below calculates 1 - p-value for the 'less than' prediction, but the Mann–Whitney U test returns an identical p-value for both examples.
https://github.com/emjun/tea-lang/blob/a93dc6ad249d9519c6b99b8af034b3e319e216a2/tea/runtimeDataStructures/testResult.py#L184
Note that we are running a two-sided Mann–Whitney test:
https://github.com/emjun/tea-lang/blob/a93dc6ad249d9519c6b99b8af034b3e319e216a2/tea/helpers/evaluateHelperMethods.py#L484
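For reference, here is a minimal sketch of the usual way a two-sided p-value is turned into a one-sided one. It assumes a symmetric, continuous null distribution (for example, the normal approximation to U). The helper name `one_sided_p`, its arguments, and the p-value are made up for illustration; this is not a description of what testResult.py actually does.

```python
def one_sided_p(two_sided_p: float, effect_in_predicted_direction: bool) -> float:
    """Convert a two-sided p-value to a one-sided one (hypothetical helper).

    Assumes a symmetric, continuous null distribution, so each tail
    holds exactly half of the two-sided p-value.
    """
    half = two_sided_p / 2.0
    # If the observed effect points the way the hypothesis predicted,
    # only the matching tail counts; otherwise we get the complement.
    return half if effect_in_predicted_direction else 1.0 - half


two_sided = 0.008  # made-up two-sided p-value, identical for both hypotheses

p_greater = one_sided_p(two_sided, effect_in_predicted_direction=True)
p_less = one_sided_p(two_sided, effect_in_predicted_direction=False)

print(p_greater, p_less)
```

Under that assumption, the same two-sided p-value maps to a small one-sided p-value for one direction and a p-value close to 1 for the other, which would be consistent with identical reported statistics but different conclusions.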
Looking into this further, I don't think this is a bug.
We would not expect that both those inequalities would reject the null hypothesis because we are testing two different hypotheses.
By expressing a one-sided hypothesis, we are not looking for effects in the other direction.
For Sport:Wrestling > Swimming, we are only sensitive to the median of Wrestling being greater than the median of Swimming. We do not "look for" the opposite.
For Sport:Wrestling < Swimming, we are only sensitive to the median of Wrestling being less than the median of Swimming. We do not "look for" the opposite.
What is confusing is that the null hypothesis is the same for one-sided (greater or lesser) and two-sided hypotheses. The null hypothesis is that there is no difference in medians.
I find this page helpful in providing general background and a similar example: https://stats.idre.ucla.edu/other/mult-pkg/faq/general/faq-what-are-the-differences-between-one-tailed-and-two-tailed-tests/
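To make the one-sided versus two-sided distinction concrete, here is a small self-contained sketch that computes exact permutation p-values for the Mann–Whitney U statistic. The weights are invented, and the function names (`u_statistic`, `exact_p_values`) are hypothetical; this is not Tea's implementation, just an illustration of the reasoning above.

```python
from itertools import combinations


def u_statistic(x, y):
    """Count pairs (a, b) with a from x, b from y, and a > b (ties count 0.5)."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_p_values(x, y):
    """Exact permutation p-values for U under the null of no group difference."""
    pooled = list(x) + list(y)
    indices = range(len(pooled))
    observed = u_statistic(x, y)

    # Enumerate every way of relabelling the pooled values into two groups.
    null_us = []
    for chosen in combinations(indices, len(x)):
        chosen_set = set(chosen)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in indices if i not in chosen_set]
        null_us.append(u_statistic(group_x, group_y))

    total = len(null_us)
    p_greater = sum(u >= observed for u in null_us) / total
    p_less = sum(u <= observed for u in null_us) / total
    p_two_sided = min(1.0, 2 * min(p_greater, p_less))
    return {"greater": p_greater, "less": p_less, "two-sided": p_two_sided}


# Made-up weights: every wrestler is heavier than every swimmer.
wrestling = [95, 102, 110, 98, 105]
swimming = [70, 75, 80, 72, 78]

p = exact_p_values(wrestling, swimming)
p_swapped = exact_p_values(swimming, wrestling)

print(p)          # "greater" is about 0.004, "less" is 1.0
print(p_swapped)  # the one-sided values trade places
```

With this data, only the "greater" hypothesis for Wrestling rejects the null at alpha = 0.05. The "less" hypothesis gives a p-value of 1.0 on the same data, while the two-sided p-value is identical no matter which group is listed first.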
I think this is a great example of limitations of and opportunities to improve output for Tea and many other tools. @meli365
I am working with this example, with lines 24-25 uncommented so that Weight is defined as an ordinal variable.
When using this hypothesis:
results = tea.hypothesize(['Sport', 'Weight'], ['Sport:Wrestling > Swimming'])
the null hypothesis is rejected as expected, with the following output:
The median of Weight for Sport = Wrestling is significantly greater than the median for Sport = Swimming.
We expect that flipping the inequality would also result in rejection of the null hypothesis:
results = tea.hypothesize(['Sport', 'Weight'], ['Sport:Wrestling < Swimming'])
but instead it produces the following output:
There is no difference in medians between Sport = Swimming and Sport = Wrestling on Weight.
Note: The values of test statistic, p value, alpha, dof, effect size, etc. are the same in the output from both hypotheses.