Closed clbustos closed 12 years ago
self-comment: the T.cdf code, based on statistics2, doesn't calculate the correct cdf when df is a Float. We should use the R or GSL code for T.
Looking at the statistics2 code just now, the function is the same except for the df.is_a? Float test. I don't know if that's the problem, but maybe the arctan term really should only be added when df is odd.
The statistics2 code is the problem, really. Both the Ruby and the C code fail when df has decimals. I want to port the R or GSL code for the t cdf, which works flawlessly. The most troublesome part is the incomplete beta function. I haven't managed to create a working implementation of it.
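For reference, here is a minimal pure-Ruby sketch of the incomplete-beta route to the t cdf. It is not the R or GSL code and not a port of either. It assumes the textbook identity that the two-tailed probability equals I_x(df/2, 1/2) with x = df / (df + t^2), and evaluates the regularized incomplete beta with the standard continued-fraction expansion (modified Lentz method). The names `betacf`, `reg_inc_beta` and `t_cdf` are hypothetical helpers, not part of statistics2, distribution or statsample.

```ruby
# Continued fraction for the incomplete beta function (modified Lentz method).
# Hypothetical helper, written for illustration only.
def betacf(a, b, x, max_iter = 300, eps = 1e-14)
  tiny = 1e-300
  qab = a + b
  qap = a + 1.0
  qam = a - 1.0
  c = 1.0
  d = 1.0 - qab * x / qap
  d = tiny if d.abs < tiny
  d = 1.0 / d
  h = d
  (1..max_iter).each do |m|
    m2 = 2 * m
    # Even step of the recurrence.
    aa = m * (b - m) * x / ((qam + m2) * (a + m2))
    d = 1.0 + aa * d
    d = tiny if d.abs < tiny
    c = 1.0 + aa / c
    c = tiny if c.abs < tiny
    d = 1.0 / d
    h *= d * c
    # Odd step of the recurrence.
    aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
    d = 1.0 + aa * d
    d = tiny if d.abs < tiny
    c = 1.0 + aa / c
    c = tiny if c.abs < tiny
    d = 1.0 / d
    delta = d * c
    h *= delta
    return h if (delta - 1.0).abs < eps
  end
  raise "betacf did not converge for a=#{a}, b=#{b}, x=#{x}"
end

# Regularized incomplete beta function I_x(a, b).
def reg_inc_beta(a, b, x)
  return 0.0 if x <= 0.0
  return 1.0 if x >= 1.0
  # Math.lgamma returns [log(|gamma|), sign]; only the log value is needed here.
  ln_front = Math.lgamma(a + b)[0] - Math.lgamma(a)[0] - Math.lgamma(b)[0] +
             a * Math.log(x) + b * Math.log(1.0 - x)
  front = Math.exp(ln_front)
  if x < (a + 1.0) / (a + b + 2.0)
    front * betacf(a, b, x) / a
  else
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) for faster convergence.
    1.0 - front * betacf(b, a, 1.0 - x) / b
  end
end

# Student's t cdf for any positive df, Integer or Float.
def t_cdf(t, df)
  df = df.to_f # treat 16 and 16.0 identically
  x = df / (df + t * t)
  tail = 0.5 * reg_inc_beta(df / 2.0, 0.5, x)
  t < 0 ? tail : 1.0 - tail
end
```

Because `df` is converted to a Float up front, there is no separate integer branch, so `t_cdf(-4.8990, 16)` and `t_cdf(-4.8990, 16.0)` go through exactly the same arithmetic.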
Ok - so I need to install GSL
Ehhh.. yes, for now.
Report: probability_not_equal_variance seems to give an incorrect value. For example:
a=[0,0,0,1,1,1,2,2,2].to_scale
b=[2,2,2,3,3,3,4,4,4].to_scale
t_2 = Statsample::Test::T::TwoSamplesIndependent.new(a,b)
t_2.probability_not_equal_variance
this gives the result: => 0.03333672278567579
However, the result should be: 0.00016053418045947065
Which actually is the value of t_2.probability_equal_variance
The issue seems to be with Distribution::T.cdf, which treats df differently depending on whether it's a Fixnum or a Float.
For the two vectors, where the variance actually is the same, the t statistic and the df should be the same for the equal_variance case and the not_equal_variance case.
But Distribution::T.cdf(-4.8990,16) doesn't give the same result as Distribution::T.cdf(-4.8990,16.0)
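To make the claim about identical t and df concrete, here is a small stand-alone sketch that computes both by hand for the two vectors, using the usual pooled-variance and Welch-Satterthwaite formulas. The helpers `mean`, `sample_variance`, `pooled_t` and `welch_t` are hypothetical names for illustration and are not statsample's API. It assumes, without having checked statsample's source, that the library uses these same textbook formulas.

```ruby
# Hypothetical helpers for illustration; not part of statsample.
def mean(xs)
  xs.inject(0.0) { |sum, x| sum + x } / xs.size
end

def sample_variance(xs)
  m = mean(xs)
  xs.inject(0.0) { |sum, x| sum + (x - m)**2 } / (xs.size - 1)
end

# Equal-variance (pooled) t statistic and degrees of freedom.
def pooled_t(a, b)
  na = a.size
  nb = b.size
  df = na + nb - 2 # always an Integer
  sp2 = ((na - 1) * sample_variance(a) + (nb - 1) * sample_variance(b)) / df
  t = (mean(a) - mean(b)) / Math.sqrt(sp2 * (1.0 / na + 1.0 / nb))
  [t, df]
end

# Unequal-variance (Welch) t statistic and Welch-Satterthwaite df.
def welch_t(a, b)
  na = a.size
  nb = b.size
  va = sample_variance(a) / na
  vb = sample_variance(b) / nb
  t = (mean(a) - mean(b)) / Math.sqrt(va + vb)
  df = (va + vb)**2 / (va**2 / (na - 1) + vb**2 / (nb - 1)) # always a Float
  [t, df]
end

a = [0, 0, 0, 1, 1, 1, 2, 2, 2]
b = [2, 2, 2, 3, 3, 3, 4, 4, 4]

t_eq, df_eq = pooled_t(a, b)
t_ne, df_ne = welch_t(a, b)
puts "pooled: t=#{t_eq} df=#{df_eq} (#{df_eq.class})"
puts "welch:  t=#{t_ne} df=#{df_ne} (#{df_ne.class})"
```

Both samples have variance 0.75 and n = 9, so both cases give t of about -4.899 and df of 16. The only difference is the type: the pooled df is an integer while the Welch df comes out as a Float, which would explain why only the not_equal_variance path hits the Float branch of the cdf.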
I only just started using statsample today, so I don't know if this is a new issue or a long-standing one.