mattwarkentin / ordered

Bindings for ordinal classification models for use with the 'parsnip' package, such as ordinal random forests by Hornung R. (2020) <doi:10.1007/s00357-018-9302-x> and others.

enable numeric and ordinal metrics for ordinal outcome models #7

Open corybrunson opened 3 weeks ago

corybrunson commented 3 weeks ago

Recently, Sakai (2021) compared several class, numeric, and proposed "ordinal" performance measures on ordinal classification tasks. This raises the questions of (1) which performance measures {yardstick} should make available for ordinal classification models and (2) how to harmonize that decision with package conventions. I don't know what challenges (2) would pose, and in any case they will depend on (1).

I think it's necessary to make measures available that are specifically designed for ordinal classification, in part because there are serious, though separate, theoretical problems with using class and numeric measures. That said, I think there are also good reasons to make both class and numeric measures available (see the sketch after this list):

  1. Commensurability: Compare results to previous work that used class or numeric measures.
  2. Benchmarking: Measure the comparative advantage of using ordinal measures.
  3. Model selection: Assess whether a nominally ordinal outcome can be treated as categorical or integer-valued (for reasons, e.g., of tractability or interpretation).
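For illustration, here is a minimal sketch of scoring the same predictions with class metrics and then with numeric metrics after coercing the ordered levels to integer scores. The `preds` tibble and its column names are hypothetical, and the equidistant 1..K scoring is just one convention, not a recommendation:

```r
library(yardstick)
library(dplyr)

# `preds` is a hypothetical tibble of held-out predictions with an ordered
# factor truth column `pain` and a hard class prediction `.pred_class`.
class_metrics <- metric_set(accuracy, kap)
class_metrics(preds, truth = pain, estimate = .pred_class)

# Numeric metrics after coercing the ordered levels to integer scores
# (1, 2, ..., K); this treats adjacent levels as equidistant.
preds_num <- preds %>%
  mutate(
    pain_num  = as.integer(pain),
    .pred_num = as.integer(.pred_class)
  )
numeric_metrics <- metric_set(rmse, mae)
numeric_metrics(preds_num, truth = pain_num, estimate = .pred_num)
```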

Because metric_set() (understandably) refuses to mix numeric and class measures, perhaps this would be best achieved by allowing ordinal_reg(), its engines, and other ordinal engines to also operate in 'regression' mode, while the specifically ordinal measures could require (erroring otherwise) or expect (warning otherwise) that the outcome is ordered, that the model type or engine is ordinal, or that some other check passes.
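As a sketch of the kind of check described above, an ordinal measure could refuse to run unless the outcome is an ordered factor. This is not an existing yardstick API, just a hypothetical function:

```r
# Hypothetical ordinal metric: mean absolute distance between level indices.
# Errors if the outcome is not ordered, guarding against plain nominal factors.
ordinal_mae <- function(truth, estimate) {
  if (!is.ordered(truth)) {
    stop("`truth` must be an ordered factor for ordinal metrics.")
  }
  if (!identical(levels(truth), levels(estimate))) {
    stop("`truth` and `estimate` must share the same levels.")
  }
  mean(abs(as.integer(truth) - as.integer(estimate)))
}
```

Wiring something like this into metric_set() would go through yardstick's metric constructors (e.g. new_class_metric()), which is where question (2) about package conventions comes in.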

This would unavoidably enable bad practice, but it's bound to come up, and I think it deserves consideration.

topepo commented 2 weeks ago

These should all be in yardstick. I've made an issue for ranked probability scores, which I favor.
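For reference, the discrete ranked probability score compares cumulative predicted probabilities with the cumulative indicator of the observed level. A minimal sketch (not yardstick's implementation; the division by K - 1 is one common normalization):

```r
# `prob` : an n x K matrix of class probabilities, columns in level order.
# `truth`: an ordered factor of length n with K levels.
ranked_prob_score <- function(prob, truth) {
  K <- nlevels(truth)
  cum_pred <- t(apply(prob, 1, cumsum))          # cumulative predicted probs
  cum_obs  <- t(sapply(as.integer(truth),        # cumulative observed indicator
                       function(k) as.numeric(seq_len(K) >= k)))
  mean(rowSums((cum_pred - cum_obs)^2) / (K - 1))
}
```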

I've read the Sakai paper(s), and they seem to think that probabilistic predictions do not exist.

TBH, everything else that I've seen is problematic in a variety of ways. MSE/MAE/RMSE based on predicted class "distances" are things that we can estimate, but I would not want to rely on them. If we use a class-based metric, I would choose Kappa or alpha or one of the others that have been studied and vetted for decades.
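One ordinal-aware example of those vetted options is weighted kappa, which (to my knowledge) yardstick's kap() already exposes through its weighting argument; the data and column names below are hypothetical:

```r
library(yardstick)

# Quadratically weighted kappa: misclassifications farther from the observed
# ordered level are penalized more heavily than adjacent ones.
kap(preds, truth = pain, estimate = .pred_class, weighting = "quadratic")
```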

A lot of the metrics I see in the CS papers seem poorly motivated, and I get the sense that they've never looked into the massive amounts of prior art on the subject.