biolab / orange3

🍊 📊 💡 Orange: Interactive data analysis
https://orangedatamining.com

Widget for statistical comparison of models #3891

Closed: ghost closed this issue 4 years ago

ghost commented 5 years ago
Expected behavior

A widget should be added for comparing model performance with statistical tests such as the t-test or the Wilcoxon signed-rank test.

Actual behavior

There is currently no widget for statistical comparison of model performance.

ghost commented 5 years ago

At the very least, it should be possible to see all performance metrics from the validation process. For instance, in 10-fold cross-validation, the program should show the metric for each fold.
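To illustrate what per-fold reporting could look like, here is a minimal standard-library sketch. It does not use Orange's API; the helper names (`k_fold_indices`, `majority_label`, `per_fold_accuracy`) are hypothetical, and the majority-class predictor is only a stand-in for a real learner.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def majority_label(labels):
    """Stand-in 'model': the most frequent training label (ties broken alphabetically)."""
    return max(sorted(set(labels)), key=labels.count)


def per_fold_accuracy(labels, k=10, seed=0):
    """Return one accuracy value per fold instead of a single average."""
    scores = []
    for test in k_fold_indices(len(labels), k, seed):
        held_out = set(test)
        train = [y for i, y in enumerate(labels) if i not in held_out]
        predicted = majority_label(train)
        scores.append(sum(labels[i] == predicted for i in test) / len(test))
    return scores


if __name__ == "__main__":
    labels = ["a"] * 70 + ["b"] * 30
    for fold, acc in enumerate(per_fold_accuracy(labels), start=1):
        print(f"fold {fold:2d}: accuracy = {acc:.2f}")
```

Keeping the individual fold scores, rather than only their mean, is also what any paired comparison of two models would need as input.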

janezd commented 5 years ago

I'd rather not promote (frequentist) null-hypothesis testing in Orange (or elsewhere). It is not only conceptually wrong per se, but also doesn't fit well into a data mining workflow, because constantly reformulating your hypotheses invalidates the tests' premises.

It would be nice, though, to have a modern Bayesian alternative. Essentially this: https://baycomp.readthedocs.io/en/latest/functions.html#single-data-set, for pairs of methods on a single data set. We don't have to depend on baycomp for this; the function is just a Bayesian reinterpretation of the t-test.
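For reference, a rough standard-library sketch of such a Bayesian correlated t-test is below. It is neither baycomp's code nor what Orange ended up shipping. The function names and the `rope` and `runs` parameters are assumptions for illustration, and the formula is my reading of the correlated t-test: the posterior of the mean score difference is a Student t distribution with n - 1 degrees of freedom, whose variance is inflated by 1 / (folds - 1) to account for overlapping training sets.

```python
import math
from statistics import mean, variance


def student_t_cdf(z, df, steps=2000):
    """CDF of the standard Student t distribution at z.

    Substituting t = sqrt(df) * tan(u) turns the density integral into
    C * integral of cos(u) ** (df - 1) over [0, atan(|z| / sqrt(df))],
    evaluated here with Simpson's rule (steps must be even).
    """
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(math.pi)
    theta = math.atan(abs(z) / math.sqrt(df))
    h = theta / steps
    total = 1.0 + math.cos(theta) ** (df - 1)  # integrand at both endpoints
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * math.cos(i * h) ** (df - 1)
    area = math.exp(log_c) * total * h / 3
    return 0.5 + math.copysign(area, z)


def correlated_ttest(scores_a, scores_b, rope=0.0, runs=1):
    """Return (P(A better), P(practically equal), P(B better)).

    scores_a and scores_b are paired per-fold scores (higher is better)
    from the same cross-validation splits; rope is the half-width of the
    region of practical equivalence; runs is the number of CV repetitions.
    """
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    folds = n // runs
    centre = mean(diffs)
    # Correlation correction: rho / (1 - rho) with rho = 1 / folds.
    scale = math.sqrt(variance(diffs) * (1 / n + 1 / (folds - 1)))
    if scale == 0:  # identical differences: the posterior collapses to a point
        p_left, p_right = float(centre < -rope), float(centre > rope)
    else:
        p_left = student_t_cdf((-rope - centre) / scale, n - 1)
        p_right = 1 - student_t_cdf((rope - centre) / scale, n - 1)
    return p_left, 1 - p_left - p_right, p_right
```

The output is three posterior probabilities rather than a p-value, so a result such as "B is better with probability 0.97" can be read directly, without a null hypothesis.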

janezd commented 4 years ago

Implemented via #4261.