Hi-Folks / statistics

PHP package that provides functions for calculating mathematical statistics of numeric data.
https://packagist.org/packages/hi-folks/statistics
MIT License
360 stars 25 forks source link

RFC: use Exceptions #13

Closed trokhymchuk closed 2 years ago

trokhymchuk commented 2 years ago

The library returns null if something wrong happens. That is the subset of an exit status mechanism, but there is only one code (e.g. null). However, that mechanism has some limitations. Firstly, end-user code must handle null everywhere:

function error() {...} // error handling function
$res = Stat::variance([0]);
$someAnotherRes = Stat::median([]);
if ($res === null) {
    error('Data size must be greater than 1');
}
if ($someAnotherRes === null) {
    error('Data must not be empty');
}

Secondly, currently null is an error-indicator (such as if you are trying to calculate mean with no elements) and indicator of the emptiness of a value (such as median of [1, 2], that is no error, but absence of median).

Also, it will clear error messages, as exception allows adding debug information.

About the implementation:

roberto-butti commented 2 years ago

@trokhymchuk Totally agree to implement Exception.

If you agree , I would like to list a table like this one for all methods, listing all cases:

Function Exception Raised Condition
mean StatisticsError $data is empty
geometricMean StatisticsError - $data is empty, - if it contains a zero - if it contains a negative value

If you agree with this, , I will spend some time to list all of that.

trokhymchuk commented 2 years ago

Yeah, that's pretty convenient. But there are not a lot of data-checks.

roberto-butti commented 2 years ago

So i think we need to create InvalidDataInputException as new exception.

Function Exception Raised Condition
mean InvalidDataInputException $data is empty
geometricMean InvalidDataInputException - $data is empty, - if it contains a zero - if it contains a negative value
harmonicMean InvalidDataInputException - if data is empty, - any element is less than zero, - or if the weighted sum isn’t positive.
median InvalidDataInputException if data is empty
medianLow InvalidDataInputException if data is empty
medianHigh InvalidDataInputException if data is empty
mode InvalidDataInputException if data is empty
multimode InvalidDataInputException if data is empty
quantiles InvalidDataInputException if length of array is < 2, and if $n (second parameter) is < 1
firstQuartile InvalidDataInputException if length of array is < 2
thirdQuartile InvalidDataInputException if length of array is < 2
pstdev InvalidDataInputException if data is empty
pvariance InvalidDataInputException if data is empty
stdev InvalidDataInputException if data is empty
variance InvalidDataInputException if length of array is < 2
covariance InvalidDataInputException if 2 arrays have diffferent size - if the length of arrays are < 2 - if the 2 input arrays has not numeric elements
correlation InvalidDataInputException if 2 arrays have diffferent size - if the length of arrays are < 2 - if the 2 input arrays has not numeric elements - if the elements of the array are constants (the same) it means that the denominator is 0
roberto-butti commented 2 years ago

Hi @trokhymchuk I toook a look to your implementation https://github.com/trokhymchuk/statistics/tree/exception I think it is good, for starting this refactoring. So if you agree, you could open a PR

trokhymchuk commented 2 years ago

Since it breaks compatibility, I think you should create branch "0.x" to support that (without exceptions) version.

I've created another branch with InvalidDataInputException: https://github.com/trokhymchuk/statistics/tree/exception-2, which one should I create pull request from?

roberto-butti commented 2 years ago

@trokhymchuk , I agree that breaking compatibility, we need to increase the version. But at this stage of the release process (we are in 0.x, waiting for a good level of maturity for going with 1.0) so I'm going to create 0.2.0. Consider that this package is really new, and we have just some few download (most of them, made by Github actions for testing).

So, if you agree:

roberto-butti commented 2 years ago

merged #15 thanks to @trokhymchuk