aeon-toolkit / aeon

A toolkit for machine learning from time series
https://aeon-toolkit.org/
BSD 3-Clause "New" or "Revised" License
1.02k stars 128 forks source link

[BUG] Different p-values in scatter and MCM #2358

Closed TonyBagnall closed 6 days ago

TonyBagnall commented 1 week ago

Describe the bug

both these plotting tools show a one sided p-value for wilcoxon sign rank test, but in or lates experiments these are different, possibly from calling different functions. Needs investigation

Steps/Code to reproduce the bug

plot both with same data

Expected results

same p vals

Actual results

image image

Versions

No response

hadifawaz1999 commented 6 days ago

can you provide the code that generates this ? is it from a notebook ?

hadifawaz1999 commented 6 days ago

@TonyBagnall in scatter plot the wilcoxon from scipy is called as such: image

in MCM, the default set of wilcoxon is zero_method="pratt" and one sided for alternative

hadifawaz1999 commented 6 days ago

good solution would be @TonyBagnall is to also parametrize the test as input for scatter function and default it to a common thing with mcm, i dont mind the dault what it would be

TonyBagnall commented 6 days ago

ah thanks @hadifawaz1999 you have saved me a job :)

TonyBagnall commented 6 days ago

Definitely default to a standard, I will look into it and choose one

TonyBagnall commented 6 days ago

also need to use the same default in CD drawing. @hadifawaz1999 for now I will standardise with whatever is used by CD drawing. I cannot imagine that it will make any difference

TonyBagnall commented 6 days ago

so CD and scatter use zero_method="wilcox", will switch mcm to that if thats ok @hadifawaz1999

TonyBagnall commented 6 days ago

zero_method = "wilcox" is the default for scipy, so will set to wilcox, but maybe we should pass as argument, possibly as kwargs?

hadifawaz1999 commented 6 days ago

zero_method = "wilcox" is the default for scipy, so will set to wilcox, but maybe we should pass as argument, possibly as kwargs?

For the mcm i added a parameter called pvalue_test_params dictionary, set to None, and if None will set default values of zero method to Pratt and alternative to one sided

I don't mind whats the default behavior of it, but its good to have the same parameter for scatter plots and cd