Closed: ghost closed this issue 4 years ago
At the very least, it should be possible to see all performance metrics computed during validation. For instance, in 10-fold cross-validation, the program should show the metric for each fold.
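For illustration of what "the metric for each fold" means (using scikit-learn rather than Orange, purely as a sketch), `cross_val_score` already returns one score per fold:

```python
# Illustration only: per-fold scores from 10-fold cross-validation.
# scikit-learn is used here as a stand-in; the request is for Orange
# to expose the same per-fold breakdown.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)
for fold, score in enumerate(scores, start=1):
    print(f"fold {fold}: accuracy = {score:.3f}")
```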
I'd rather not promote (frequentist) null-hypothesis testing in Orange (or elsewhere). It is not only conceptually wrong per se, but also doesn't fit well into a data-mining workflow, because constantly reformulating your hypotheses invalidates the tests' premises.
It would be nice, though, to have a modern Bayesian alternative. Essentially this: https://baycomp.readthedocs.io/en/latest/functions.html#single-data-set, for pairs of methods on a single data set. We don't have to depend on baycomp for this; the function is just a Bayesian reinterpretation of the t-test.
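To make the "Bayesian reinterpretation of the t-test" concrete, here is a minimal sketch (not the baycomp implementation; the function name and ROPE default are my own) of the Bayesian correlated t-test for two models evaluated on the same k-fold CV split. The posterior of the mean score difference is a Student t distribution, with the scale corrected for the correlation between folds (rho = 1/k):

```python
# Sketch of a Bayesian correlated t-test, in the spirit of
# baycomp.two_on_single -- an assumption-labeled illustration,
# not Orange's or baycomp's actual code.
import numpy as np
from scipy import stats

def bayesian_ttest(scores_a, scores_b, rope=0.01, k=10):
    """Posterior probabilities P(A worse), P(practically equal), P(A better).

    scores_a, scores_b: per-fold scores from the same k-fold CV split.
    rope: half-width of the region of practical equivalence.
    k: number of folds; rho = 1/k approximates the correlation induced
       by the overlapping training sets.
    """
    diff = np.asarray(scores_a, float) - np.asarray(scores_b, float)
    n = len(diff)
    rho = 1.0 / k
    scale = np.sqrt((1.0 / n + rho / (1.0 - rho)) * diff.var(ddof=1))
    posterior = stats.t(df=n - 1, loc=diff.mean(), scale=scale)
    p_left = posterior.cdf(-rope)            # A worse than B beyond the ROPE
    p_rope = posterior.cdf(rope) - p_left    # difference inside the ROPE
    p_right = 1.0 - posterior.cdf(rope)      # A better than B beyond the ROPE
    return p_left, p_rope, p_right
```

Unlike a p-value, the three probabilities answer the question directly: how likely is each model to be practically better, and how likely are they practically equivalent.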
Implemented via #4261.
Orange version
Expected behavior
A widget should be added to compare model performances, using the paired t-test, the Wilcoxon signed-rank test, etc.
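For reference, the tests named here are available in SciPy and take the two models' per-fold scores directly (the scores below are synthetic, for illustration only):

```python
# Frequentist paired comparisons on per-fold scores, as the request
# describes. Note: fold scores from the same CV split are not
# independent, so these p-values tend to be optimistic.
from scipy import stats

model_a = [0.85, 0.87, 0.86, 0.88, 0.84, 0.86, 0.87, 0.85, 0.86, 0.88]
model_b = [0.80, 0.81, 0.79, 0.82, 0.80, 0.81, 0.80, 0.79, 0.81, 0.80]

t_stat, t_p = stats.ttest_rel(model_a, model_b)   # paired t-test
w_stat, w_p = stats.wilcoxon(model_a, model_b)    # Wilcoxon signed-rank test
print(f"paired t-test:        p = {t_p:.4f}")
print(f"Wilcoxon signed-rank: p = {w_p:.4f}")
```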
Actual behavior
There is currently no widget available for statistical comparison of model performances.
Steps to reproduce the behavior
Additional info (worksheets, data, screenshots, ...)