Closed: ghost closed this issue 4 years ago
At the very least, it should be possible to see all performance metrics computed during validation. For instance, in 10-fold cross-validation, the program should show the metric for each fold.
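For illustration of what "the metric for each fold" means (using scikit-learn rather than Orange, purely as a sketch), `cross_val_score` already returns one score per fold:

```python
# Illustration only: per-fold scores from 10-fold cross-validation.
# scikit-learn is used here as a stand-in; the request is for Orange
# to expose the same per-fold breakdown.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)
for fold, score in enumerate(scores, start=1):
    print(f"fold {fold}: accuracy = {score:.3f}")
```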
I'd rather not promote (frequentist) null-hypothesis testing in Orange (or elsewhere). It is not only conceptually wrong per se, but also doesn't fit well into a data-mining workflow, because constantly reformulating your hypotheses invalidates the tests' premises.
It would be nice, though, to have a modern Bayesian alternative. Essentially this: https://baycomp.readthedocs.io/en/latest/functions.html#single-data-set, for pairs of methods on a single data set. We don't have to depend on baycomp for this; the function is just a Bayesian reinterpretation of the t-test.
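To make the "Bayesian reinterpretation of the t-test" concrete, here is a minimal sketch (not the baycomp implementation; the function name and ROPE default are my own) of the Bayesian correlated t-test for two models evaluated on the same k-fold CV split. The posterior of the mean score difference is a Student t distribution, with the scale corrected for the correlation between folds (rho = 1/k):

```python
# Sketch of a Bayesian correlated t-test, in the spirit of
# baycomp.two_on_single -- an assumption-labeled illustration,
# not Orange's or baycomp's actual code.
import numpy as np
from scipy import stats

def bayesian_ttest(scores_a, scores_b, rope=0.01, k=10):
    """Posterior probabilities P(A worse), P(practically equal), P(A better).

    scores_a, scores_b: per-fold scores from the same k-fold CV split.
    rope: half-width of the region of practical equivalence.
    k: number of folds; rho = 1/k approximates the correlation induced
       by the overlapping training sets.
    """
    diff = np.asarray(scores_a, float) - np.asarray(scores_b, float)
    n = len(diff)
    rho = 1.0 / k
    scale = np.sqrt((1.0 / n + rho / (1.0 - rho)) * diff.var(ddof=1))
    posterior = stats.t(df=n - 1, loc=diff.mean(), scale=scale)
    p_left = posterior.cdf(-rope)            # A worse than B beyond the ROPE
    p_rope = posterior.cdf(rope) - p_left    # difference inside the ROPE
    p_right = 1.0 - posterior.cdf(rope)      # A better than B beyond the ROPE
    return p_left, p_rope, p_right
```

Unlike a p-value, the three probabilities answer the question directly: how likely is each model to be practically better, and how likely are they practically equivalent.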
Implemented via #4261.
Orange version
Expected behavior
A widget should be added to compare model performances, using the paired t-test, the Wilcoxon signed-rank test, etc.
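For reference, the tests named here are available in SciPy and take the two models' per-fold scores directly (the scores below are synthetic, for illustration only):

```python
# Frequentist paired comparisons on per-fold scores, as the request
# describes. Note: fold scores from the same CV split are not
# independent, so these p-values tend to be optimistic.
from scipy import stats

model_a = [0.85, 0.87, 0.86, 0.88, 0.84, 0.86, 0.87, 0.85, 0.86, 0.88]
model_b = [0.80, 0.81, 0.79, 0.82, 0.80, 0.81, 0.80, 0.79, 0.81, 0.80]

t_stat, t_p = stats.ttest_rel(model_a, model_b)   # paired t-test
w_stat, w_p = stats.wilcoxon(model_a, model_b)    # Wilcoxon signed-rank test
print(f"paired t-test:        p = {t_p:.4f}")
print(f"Wilcoxon signed-rank: p = {w_p:.4f}")
```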
Actual behavior
There is currently no widget available for statistical comparison of model performances.
Steps to reproduce the behavior
Additional info (worksheets, data, screenshots, ...)