Summary

Our published implementation of Chacko_test_1xc() needs further testing and, potentially, adjustments.

Motivation

Here's an extract of some further thoughts from Prof. Graeme Ruxton connected to issue #36:

I am not sure that the software implementation matches Chacko’s procedure in terms of how the test statistic should be calculated. The key issues to me are that weights can take values above 2 and that these weights are used in the averaging used in the ordering process.

I think Chacko’s ordering process is as follows.

We begin with a list of k values x1, …, xk with associated weights t1 = t2 = … = tk = 1.

If, for any 1 ≤ i ≤ (k-1), we have that xi > xi+1, then we replace both values xi and xi+1 by a single value which is their weighted average (using the weights ti and ti+1). This new value takes the combined weight of the two values it replaces, ti + ti+1. The list is now one shorter, so k becomes k-1.

We repeat this process until k = 1 or we have a monotone increasing sequence of numbers.
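
The steps above can be sketched in Python as follows. This is only an illustration of the ordering process as described here, not the package's R code: the function name `chacko_pool` and the input values are hypothetical, and the sketch assumes that pooling happens only on a strict `>` violation.

```python
def chacko_pool(values):
    """Pool adjacent violators; return a list of (value, weight) pairs.

    Hypothetical helper: every value starts with weight 1, and two
    neighbours are replaced by their weighted average whenever the
    left one is strictly greater than the right one.
    """
    blocks = []  # each block is [weighted average, combined weight]
    for x in values:
        blocks.append([float(x), 1])
        # Merge backwards while the last two blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            x2, t2 = blocks.pop()
            x1, t1 = blocks.pop()
            blocks.append([(t1 * x1 + t2 * x2) / (t1 + t2), t1 + t2])
    return [(x, t) for x, t in blocks]


# Made-up input, not one of the examples from Chacko (1966).
print(chacko_pool([10, 16, 14, 12, 18]))
# [(10.0, 1), (14.0, 3), (18.0, 1)]
```

In this made-up input, 16 and 14 first pool to 15 with weight 2, and that pooled value is then averaged with 12 using weights 2 and 1, giving 14 with weight 3. Weights above 2 therefore do take part in the averaging.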

This procedure allows me to recreate the two ordering process examples shown on pages 187 and 189 of Chacko (1966). Notice that Chacko was entirely comfortable with this ordering process ending with a single value. If you look at the table on page 188, he suggests that under the null hypothesis, if you start with a list of 5 values, you have a 20% chance that the ordering process results in a single value.
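
The 20% figure can be checked by brute force over all permutations of a list of 5 values. The sketch below is a hypothetical Python check, not the package's test suite. The helper `pooled_length` and the input values are assumptions, and the values were picked so that no two pooled averages can tie.

```python
from itertools import permutations


def pooled_length(values):
    """Number of values left after pooling adjacent violators (hypothetical helper)."""
    blocks = []  # each block is [weighted average, combined weight]
    for x in values:
        blocks.append([float(x), 1])
        # Pool while the left neighbour is strictly greater than the right one.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            x2, t2 = blocks.pop()
            x1, t1 = blocks.pop()
            blocks.append([(t1 * x1 + t2 * x2) / (t1 + t2), t1 + t2])
    return len(blocks)


values = [1, 2, 4, 8, 16]  # assumed input: distinct, and no ties among averages
perms = list(permutations(values))
single = sum(1 for p in perms if pooled_length(p) == 1)
print(single, len(perms), single / len(perms))
# 24 120 0.2
```

With inputs such as 1, 2, 3, 4, 5, some pooled averages coincide exactly, and a strict `>` rule leaves equal neighbours unpooled, so the observed fraction may differ from 20% there.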

Even if the outcome is a single value, the test statistic can be calculated from equation 5 on page 188.
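
As an illustration, the sketch below computes a statistic of the form T = (k/n) Σ tj (x̄j − n/k)², where n is the sum of the original k values and the sum runs over the m pooled values x̄j with weights tj. That form is an assumption about equation 5 and should be checked against page 188, and the function name `chacko_statistic` and the inputs are hypothetical.

```python
def chacko_statistic(counts):
    """Return (T, m) for a list of counts.

    Assumed form (to be checked against equation 5 of Chacko, 1966):
    T = (k / n) * sum(t_j * (xbar_j - n / k) ** 2) over the m pooled values.
    """
    k = len(counts)
    n = sum(counts)
    blocks = []  # each block is [weighted average, combined weight]
    for x in counts:
        blocks.append([float(x), 1])
        # Pool while the left neighbour is strictly greater than the right one.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            x2, t2 = blocks.pop()
            x1, t1 = blocks.pop()
            blocks.append([(t1 * x1 + t2 * x2) / (t1 + t2), t1 + t2])
    stat = (k / n) * sum(t * (x - n / k) ** 2 for x, t in blocks)
    return stat, len(blocks)


print(chacko_statistic([10, 16, 14, 12, 18]))  # m = 3
print(chacko_statistic([5, 4, 3, 2, 1]))       # m = 1
# (0.0, 1)
```

Under this assumed form, a single pooled value always equals n/k, so the statistic is exactly 0 whenever m = 1. It can still be computed in that case, even though a chi-squared reference distribution with m - 1 = 0 degrees of freedom is not defined.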

The question then becomes how do we obtain a p-value associated with this test statistic.

I agree that just under equation 5, Chacko says that the test statistic is asymptotically chi-squared with m - 1 degrees of freedom (where m is the length of the final ordered list of values). However, he does not discuss what this means in the event of m = 1, even though, as I discuss above, he clearly expects that this will happen sometimes.

If you look at his final example on how to evaluate significance, he says that you reject the null hypothesis if the observed value calculated from equation (5) is greater than the value c obtained from equation (6). This approach does work when m = 1 but seems very cumbersome, because it requires calculating the probability values given in the table on page 188, which in turn requires consulting Chacko (1963).

Tasks

[ ] Recreate the two ordering process examples shown on pages 187 and 189 of Chacko (1966)
[ ] Add test with a vector of 5 numbers and see if 20% of its permutations end up with a single value
[ ] Allow test statistic to be calculated for single-value outputs (not necessarily p-value, though)
[x] Test implementation of Chacko (1963) table to calculate "p-value range" for m = 1

ocbe-uio / contingencytables

Review Chacko test #38
