clbustos / distribution

Statistical Distributions multi library wrapper. Uses Ruby by default and C (statistics2/GSL) or Java extensions where available.

T.cdf on ruby engine doesn't work right #2

Closed clbustos closed 12 years ago

clbustos commented 13 years ago

Report: Seems to give incorrect value. For example:

    a = [0,0,0,1,1,1,2,2,2].to_scale
    b = [2,2,2,3,3,3,4,4,4].to_scale
    t_2 = Statsample::Test::T::TwoSamplesIndependent.new(a,b)
    t_2.probability_not_equal_variance

This gives the result: => 0.03333672278567579

However, the result should be 0.00016053418045947065, which is actually the value of t_2.probability_equal_variance.

The issue seems to be with Distribution::T.cdf, which treats df differently depending on whether it's a Fixnum or a Float.

For the two vectors, where the variance actually is the same, the t statistic and the df should be the same for the equal_variance case and the not_equal_variance case.
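To make that claim concrete, here is a small plain-Ruby sketch that computes both statistics for the two vectors by hand. It does not use statsample; the helper names (mean, sample_variance, pooled_t_and_df, welch_t_and_df) are made up for illustration, and the Welch-Satterthwaite formula is assumed to be what the not_equal_variance case uses. The point is that the Welch df comes out as a Float close to 16, while the pooled df is the Integer 16.

```ruby
# Hypothetical helpers for illustration only; not part of statsample.
def mean(xs)
  xs.sum.to_f / xs.size
end

# Unbiased sample variance (divides by n - 1).
def sample_variance(xs)
  m = mean(xs)
  xs.sum { |x| (x - m)**2 } / (xs.size - 1)
end

# Student's t with pooled variance: df is an Integer, n1 + n2 - 2.
def pooled_t_and_df(a, b)
  n1 = a.size
  n2 = b.size
  df = n1 + n2 - 2
  sp2 = ((n1 - 1) * sample_variance(a) + (n2 - 1) * sample_variance(b)) / df
  t = (mean(a) - mean(b)) / Math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
  [t, df]
end

# Welch's t: df from the Welch-Satterthwaite formula, always a Float.
def welch_t_and_df(a, b)
  se1 = sample_variance(a) / a.size
  se2 = sample_variance(b) / b.size
  t = (mean(a) - mean(b)) / Math.sqrt(se1 + se2)
  df = (se1 + se2)**2 / (se1**2 / (a.size - 1) + se2**2 / (b.size - 1))
  [t, df]
end

a = [0, 0, 0, 1, 1, 1, 2, 2, 2]
b = [2, 2, 2, 3, 3, 3, 4, 4, 4]

t_pooled, df_pooled = pooled_t_and_df(a, b)
t_welch, df_welch = welch_t_and_df(a, b)

puts "pooled: t=#{t_pooled} df=#{df_pooled} (#{df_pooled.class})"
puts "welch:  t=#{t_welch} df=#{df_welch} (#{df_welch.class})"
```

Both variances are 0.75 and both samples have 9 elements, so the two t statistics agree and the Welch df reduces to 16. Only the numeric type of df differs, which is exactly the case that trips up T.cdf.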

But Distribution::T.cdf(-4.8990,16) doesn't give the same result as Distribution::T.cdf(-4.8990,16.0)

I only just started using statsample today, so I don't know if this is a new issue or a long-standing one.

clbustos commented 13 years ago

Self-comment: the T.cdf code, based on statistics2, doesn't calculate the correct cdf when df is a Float. We should use the R or GSL code for T.

fauman commented 13 years ago

Looking at the statistics2 code just now, the function is the same except for the df.is_a? Float test. I don't know if that's the problem, but maybe the arctan function really does only get added when df is odd.

clbustos commented 13 years ago

The statistics2 code is the problem, really. Both the Ruby and C code fail when df has decimals. I want to port the R or GSL code for the t cdf, which works flawlessly. The most troublesome method is the incomplete beta function. I haven't managed to create a working implementation of it.
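For reference, here is a rough pure-Ruby sketch of a t cdf built on the regularized incomplete beta function. It is not the R or GSL code and not what the distribution gem ships; it uses the textbook continued-fraction expansion evaluated with the modified Lentz method, and the names beta_cf, regularized_beta and t_cdf are made up for this example. It relies on the identity P(T <= t) = 0.5 * I_x(df/2, 1/2) for t < 0, with x = df / (df + t^2), which holds for any positive real df.

```ruby
# Continued fraction for the incomplete beta function, evaluated with the
# modified Lentz method. Hypothetical helper, assuming 0 < x < 1.
def beta_cf(a, b, x, max_iter = 1000, eps = 1e-14)
  tiny = 1e-300
  qab = a + b
  qap = a + 1.0
  qam = a - 1.0
  c = 1.0
  d = 1.0 - qab * x / qap
  d = tiny if d.abs < tiny
  d = 1.0 / d
  h = d
  (1..max_iter).each do |m|
    m2 = 2 * m
    # Even-numbered term of the continued fraction.
    aa = m * (b - m) * x / ((qam + m2) * (a + m2))
    d = 1.0 + aa * d
    d = tiny if d.abs < tiny
    c = 1.0 + aa / c
    c = tiny if c.abs < tiny
    d = 1.0 / d
    h *= d * c
    # Odd-numbered term of the continued fraction.
    aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
    d = 1.0 + aa * d
    d = tiny if d.abs < tiny
    c = 1.0 + aa / c
    c = tiny if c.abs < tiny
    d = 1.0 / d
    delta = d * c
    h *= delta
    break if (delta - 1.0).abs < eps
  end
  h # best estimate, even if max_iter was reached
end

# Regularized incomplete beta function I_x(a, b).
def regularized_beta(x, a, b)
  return 0.0 if x <= 0.0
  return 1.0 if x >= 1.0
  # Math.lgamma returns [log(|gamma(v)|), sign]; only the log is needed here.
  ln_front = Math.lgamma(a + b)[0] - Math.lgamma(a)[0] - Math.lgamma(b)[0] +
             a * Math.log(x) + b * Math.log(1.0 - x)
  front = Math.exp(ln_front)
  # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where it converges faster.
  if x < (a + 1.0) / (a + b + 2.0)
    front * beta_cf(a, b, x) / a
  else
    1.0 - front * beta_cf(b, a, 1.0 - x) / b
  end
end

# Student's t cdf for any positive df, Integer or Float.
def t_cdf(t, df)
  df = df.to_f
  x = df / (df + t * t)
  tail = 0.5 * regularized_beta(x, df / 2.0, 0.5)
  t < 0 ? tail : 1.0 - tail
end

t = -Math.sqrt(24.0) # about -4.8990, the statistic from the report
puts 2 * t_cdf(t, 16)   # two-sided p with Integer df
puts 2 * t_cdf(t, 16.0) # two-sided p with Float df
```

Because df is converted to a Float up front, the Integer and Float calls take the same code path. With the t statistic from the report, the two-sided p-value should land near the 0.00016053418045947065 quoted above. A production port would still need argument checks (df > 0, NaN handling) and a proper accuracy study across the parameter range.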

fauman commented 13 years ago

Ok - so I need to install GSL

clbustos commented 13 years ago

Ehhh.. yes, for now.