gnu-octave / statistics

The Statistics package for GNU Octave

PiecewiseLinearDistribution cannot compute `mean`, `std`, and `var` when truncated. #127

Closed pr0m1th3as closed 5 months ago

pr0m1th3as commented 5 months ago

There is an issue with the `mean`, `std`, and `var` methods of truncated PiecewiseLinearDistribution objects. It seems to be related to the built-in `integral` function throwing an error.
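For context, a sketch of the textbook definitions involved (not taken from the package source): for a distribution with density $f$ and CDF $F$, truncated to $[a, b]$, the mean and variance are usually written as the integrals below. This is presumably why these methods end up calling `integral`. The symbols $a$, $b$, $\mu_t$, and $\sigma_t^2$ are notation introduced here for illustration.

$$
\mu_t = \frac{\int_a^b x\, f(x)\, dx}{F(b) - F(a)},
\qquad
\sigma_t^2 = \frac{\int_a^b (x - \mu_t)^2\, f(x)\, dx}{F(b) - F(a)}
$$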

>> pkg load statistics
>> load patients
>> [f, x] = ecdf (Weight);
>> f = f(1:5:end);
>> x = x(1:5:end);
>> pd = PiecewiseLinearDistribution (x, f);
>> t = truncate (pd, 130, 180)
t =
  PiecewiseLinearDistribution

F(111) = 0
F(118) = 0.05
F(124) = 0.13
F(130) = 0.25
F(135) = 0.37
F(142) = 0.5
F(163) = 0.55
F(171) = 0.61
F(178) = 0.7
F(183) = 0.82
F(189) = 0.94
F(202) = 1
  Truncated to the interval [130, 180]

>> mean (pd)
ans = 153.61
>> mean (t)
error: quadcc: integrand F must return a single, real-valued vector of the same size as the input
error: called from
    integral at line 139 column 11
    mean at line 243 column 11
>>

NRJank commented 5 months ago

`integral` requires the integrand function to be vectorized, so that it can pass an array of inputs and get back an array of outputs of the same size. (This lets the quadrature routine process its evaluation points in parallel.) There is an "ArrayValued" option that will generally let you work with non-vectorized functions, but there will be a performance penalty.
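A minimal sketch of the difference, assuming a made-up density f(x) = 2x on [0, 1] rather than anything from the package. The names `pdf_fn`, `g_scalar`, and `g_vec` are hypothetical. It shows the two ways around the restriction: vectorize the integrand, or keep it scalar-only and pass "ArrayValued".

```octave
% Hypothetical density on [0, 1], chosen only for illustration: f(x) = 2x.
pdf_fn = @(x) 2 * x;

% Non-vectorized integrand: "*" is a matrix product, so this only works
% when x is a scalar.
g_scalar = @(x) x * pdf_fn (x);

% Vectorized integrand: ".*" works element by element on an array of inputs.
g_vec = @(x) x .* pdf_fn (x);

% Option 1: give integral a vectorized integrand.
m_vec = integral (g_vec, 0, 1);

% Option 2: keep the scalar-only integrand and evaluate one point at a time.
m_arr = integral (g_scalar, 0, 1, "ArrayValued", true);

% Both approximate the analytic mean, 2/3.
```

Vectorizing the integrand is usually the better choice where it is possible, since "ArrayValued" calls the integrand once per evaluation point.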

pr0m1th3as commented 5 months ago

Just pushed a change to address this issue. I don't have access to MATLAB at the moment. Can you confirm that MATLAB returns the same expected values before I close this issue?


load patients
[f, x] = ecdf (Weight);
f = f(1:5:end);
x = x(1:5:end);
pd = PiecewiseLinearDistribution (x, f);
t = truncate (pd, 130, 180);
mean (t)    % expected: 152.311
std (t)     % expected: 18.2941
var (t)     % expected: 334.6757

pr0m1th3as commented 5 months ago

Closing as fixed.