Closed Moelf closed 11 months ago
Like this: sum((o-e)*(o-e)/e for (o,e) in zip(O,E))?
Yeah, and preferably also automatically give reduced chi2
Wouldn't you just divide this sum by the number of observations?
Yeah but right now we don't have this particular Pearson chi2
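For reference, here is a minimal sketch of the statistic and its reduced form in plain Julia, with no package required. The function names pearson_chi2 and reduced_chi2 and the nfitted keyword are made up for illustration and are not part of HypothesisTests.jl. The sketch divides by the degrees of freedom rather than by the number of observations, and it assumes the degrees of freedom are the number of bins minus one minus the number of fitted parameters. The data are the dice counts from the Wikipedia example.

```julia
# Pearson chi-squared statistic from observed and expected counts.
# Expected counts may be any real numbers (they need not be integers).
function pearson_chi2(observed, expected)
    length(observed) == length(expected) ||
        throw(ArgumentError("observed and expected must have the same length"))
    return sum((o - e)^2 / e for (o, e) in zip(observed, expected))
end

# Reduced chi-squared: the statistic divided by the degrees of freedom.
# Assumption: dof = number of bins - 1 - number of fitted parameters.
function reduced_chi2(observed, expected; nfitted::Integer = 0)
    dof = length(observed) - 1 - nfitted
    dof > 0 || throw(ArgumentError("degrees of freedom must be positive"))
    return pearson_chi2(observed, expected) / dof
end

O = [5, 8, 9, 8, 10, 20]   # observed counts (dice example)
E = fill(10.0, 6)          # expected counts, here floating point
chi2 = pearson_chi2(O, E)  # approximately 13.4
red = reduced_chi2(O, E)   # approximately 2.68 (13.4 / 5)
```

Because nothing here requires integer expected counts, the same two calls work when E comes from a fitted model, provided nfitted is set accordingly.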
Well, an implementation of Pearson's chi-squared test for goodness of fit (the one you mentioned at the top) does exist. You only need to convert the expected counts into proportions and use the two-argument call ChisqTest(O, pᵢ), e.g.
using HypothesisTests

# From https://en.wikipedia.org/wiki/Pearson's_chi-squared_test#Fairness_of_dice
O = [5,8,9,8,10,20] # observed counts
E = fill(10, 6) # expected counts = 10
pᵢ = E./sum(E) # get proportions
ChisqTest(O, pᵢ)
This gives the following result:
julia> t = ChisqTest(O, pᵢ)
Pearson's Chi-square Test
-------------------------
Population details:
parameter of interest: Multinomial Probabilities
value under h_0: [0.166667, 0.166667, 0.166667, 0.166667, 0.166667, 0.166667]
point estimate: [0.0833333, 0.133333, 0.15, 0.133333, 0.166667, 0.333333]
95% confidence interval: [(0.0, 0.2111), (0.01667, 0.2611), (0.03333, 0.2777), (0.01667, 0.2611), (0.05, 0.2944), (0.2167, 0.4611)]
Test summary:
outcome with 95% confidence: reject h_0
one-sided p-value: 0.0199
Details:
Sample size: 60
statistic: 13.400000000000002
degrees of freedom: 5
residuals: [-1.58114, -0.632456, -0.316228, -0.632456, 0.0, 3.16228]
std. residuals: [-1.73205, -0.69282, -0.34641, -0.69282, 0.0, 3.4641]
julia> t.stat / t.df # reduced Chi^2
2.6800000000000006
I see. So maybe a possible change is to just auto-normalize, instead of this: https://github.com/JuliaStats/HypothesisTests.jl/blob/be980f3ca89908cf63e60307287fe9fad02c47ad/src/power_divergence.jl#L311
It's possible to add a ChisqTest(x::AbstractVector{T}, y::AbstractVector{T}) where {T<:Integer} method for counts and calculate the proportions in it.
Unfortunately, our expected counts are floating-point numbers.
Then you have no option other than to normalize them. Another option would be a keyword parameter for proportions.
Can you explain your use case a bit more? I understand you have observed and expected counts and just want to run the chi2 test on them?
The ChisqTest constructor should really be improved. theta0 should be a keyword argument, as it is for PowerDivergenceTest; otherwise the risk of confusion with y is too high.
I'm reluctant to normalize theta0 automatically, as it could hide bugs, but we could add an argument to enable that, as @wildart said, and accept any vector of numbers. FWIW, R's chisq.test has p and rescale.p arguments for that.
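To make the opt-in idea concrete, here is a rough sketch of a helper that rescales only when asked and otherwise rejects a vector that does not sum to one. The name expected_to_proportions and the rescale keyword are hypothetical, chosen to echo R's rescale.p. This is not an existing or proposed HypothesisTests.jl API.

```julia
# Hypothetical helper: turn expected counts (or unnormalized weights) into a
# probability vector usable as the second argument of ChisqTest(O, p).
function expected_to_proportions(expected::AbstractVector{<:Real}; rescale::Bool = false)
    all(>=(0), expected) ||
        throw(ArgumentError("expected values must be non-negative"))
    total = sum(expected)
    total > 0 || throw(ArgumentError("expected values must not all be zero"))
    if rescale
        # Explicit opt-in: divide by the total so the result sums to one.
        return collect(float.(expected) ./ total)
    end
    # Without rescaling, insist that the input already sums to one,
    # so that a wrong vector is not silently accepted.
    isapprox(total, 1; atol = sqrt(eps())) ||
        throw(ArgumentError("expected values must sum to one unless rescale=true"))
    return collect(float.(expected))
end

E = [9.5, 10.5, 10.0, 10.0, 10.0, 10.0]   # floating-point expected counts
p = expected_to_proportions(E; rescale = true)
```

The resulting p can then be passed as ChisqTest(O, p), as in the dice example earlier in the thread. Keeping rescale off by default preserves the current behavior of erroring on an unnormalized vector.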
This is fixed.
https://en.wikipedia.org/wiki/Pearson%27s_chi-squared_test#Calculating_the_test-statistic
What's the easiest way to perform this Pearson chi2 test?