SemanticMediaWiki / SemanticResultFormats

Provides additional visualizations (result formats) for Semantic MediaWiki
https://www.semantic-mediawiki.org/wiki/Extension:Semantic_Result_Formats

adding statistical result formats #604

Closed krabina closed 4 years ago

krabina commented 4 years ago

This PR addresses or contains: more math/statistical functions

You can see it here. Documentation of these new result formats will be provided as soon as the PR is accepted into master.

(The calculation of variance and standard deviation differs depending on whether the ask query returns only a sample of the data or all of it, so there are two formats for each. The same goes for the calculations where Excel has both an .INC and an .EXC formula: here quartilupper corresponds to QUARTILE.INC in Excel and quartilupper.exc to QUARTILE.EXC.)
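To illustrate the distinction described above, here is a minimal sketch in plain PHP. It is not the code from this PR: the function names `variance`, `standardDeviation` and `upperQuartile` and their boolean flags are hypothetical, chosen only for illustration. The quartile follows the interpolation that Excel documents for QUARTILE.INC and QUARTILE.EXC.

```php
<?php
// Hypothetical helpers for illustration only; names and signatures are assumptions.

// Variance of a list of numbers. With $sample = true the sum of squared
// deviations is divided by n - 1 (sample), otherwise by n (population).
function variance( array $numbers, bool $sample ): float {
    $n = count( $numbers );
    if ( $n < ( $sample ? 2 : 1 ) ) {
        throw new InvalidArgumentException( 'Not enough values' );
    }
    $mean = array_sum( $numbers ) / $n;
    $sumOfSquares = 0.0;
    foreach ( $numbers as $x ) {
        $sumOfSquares += ( $x - $mean ) ** 2;
    }
    return $sumOfSquares / ( $sample ? $n - 1 : $n );
}

// Standard deviation is the square root of the matching variance.
function standardDeviation( array $numbers, bool $sample ): float {
    return sqrt( variance( $numbers, $sample ) );
}

// Upper (third) quartile with linear interpolation between neighbours.
// Inclusive: 0-based rank 0.75 * (n - 1). Exclusive: 1-based rank 0.75 * (n + 1),
// which is undefined when it falls outside 1..n.
function upperQuartile( array $numbers, bool $exclusive ): float {
    sort( $numbers );
    $n = count( $numbers );
    if ( $n === 0 ) {
        throw new InvalidArgumentException( 'Not enough values' );
    }
    if ( $exclusive ) {
        $rank = 0.75 * ( $n + 1 );
        if ( $rank < 1 || $rank > $n ) {
            throw new InvalidArgumentException( 'Too few values for the exclusive method' );
        }
        $rank -= 1; // convert to a 0-based position
    } else {
        $rank = 0.75 * ( $n - 1 );
    }
    $lower = (int)floor( $rank );
    $upper = min( $lower + 1, $n - 1 );
    $fraction = $rank - $lower;
    return $numbers[$lower] + $fraction * ( $numbers[$upper] - $numbers[$lower] );
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the population variance is 4 and the sample variance is 32/7 (about 4.571). For the values 1 to 8 the inclusive upper quartile is 6.25 and the exclusive one is 6.75.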

tested on

MediaWiki | 1.31.7
PHP | 7.2.32 (fpm-fcgi)
MariaDB | 10.3.23-MariaDB
ICU | 50.2
SMW | 3.1.6

This PR includes:

JeroenDeDauw commented 4 years ago

The added code does not depend on SMW and could be put into dedicated pure functions outside of the result printer. This would also make it trivial to write some simple unit tests for them.

krabina commented 4 years ago

What do you mean by "does not depend on SMW"? It takes values from an ask query and does calculations with them. We just extended what was already there in SRF_Math.php to add more functions in the same manner.

Changing SRF_Math.php in general might be a different task outside of the scope of this PR.

JeroenDeDauw commented 4 years ago

The code can be in functions that take the array of numbers and return a number. Those functions do not depend on SMW or MediaWiki, making testing very easy.

I'm not suggesting to improve the existing code. Would be nice to not make the mess worse, which can be done via dedicated functions. The main benefit being increased confidence that the code works as expected and will continue to do so.

krabina commented 4 years ago

I see. So instead of

 case 'average':
    return array_sum( $numbers ) / count( $numbers );
    break;

you would like to see

 case 'average':
    return $this->average( $numbers );
    break;

and at the bottom a private function average that does the actual calculation?

JeroenDeDauw commented 4 years ago

Yeah - basically. Though to test these you could make them public static functions. And perhaps best to stuff them in a dedicated "class" rather than the result printer.

JeroenDeDauw commented 4 years ago

So

class MathStuffs {
    public static function average( array $numbers ): float {
        return array_sum( $numbers ) / count( $numbers );
    }
}
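
As a rough sketch of why this makes testing easy, the class can be exercised with no SMW or MediaWiki bootstrap at all. The empty-input guard and the `check` helper below are illustrative additions, not part of the snippet above, and plain PHP checks stand in for what would normally be a PHPUnit test.

```php
<?php
// Sketch only: MathStuffs mirrors the snippet above. The guard against an
// empty array and the check() helper are assumptions added for illustration.
class MathStuffs {
    public static function average( array $numbers ): float {
        if ( $numbers === [] ) {
            throw new InvalidArgumentException( 'Cannot average an empty list' );
        }
        return array_sum( $numbers ) / count( $numbers );
    }
}

// Minimal assertion helper so the script fails loudly on a wrong result.
function check( bool $condition, string $message ): void {
    if ( !$condition ) {
        throw new RuntimeException( 'Failed: ' . $message );
    }
}

// The checks need nothing but an array of numbers.
check( MathStuffs::average( [ 1, 2, 3 ] ) === 2.0, 'average of 1, 2, 3' );
check( MathStuffs::average( [ 1, 2 ] ) === 1.5, 'average of 1, 2' );
check( MathStuffs::average( [ -4, 4 ] ) === 0.0, 'average of -4, 4' );
```

Because the function only takes an array and returns a number, the same checks could be moved into a regular unit test without setting up a wiki.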

krabina commented 4 years ago

like this?

krabina commented 4 years ago

This should be ready now from our side...

krabina commented 4 years ago

Median was already part of SRF before, btw.