New metric: AUNP

DavisVaughan commented 5 years ago

AUNP is the AUC of each class against the rest, weighted by the a priori class distribution.
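For reference, a sketch of how this definition is usually written, following the notation of the paper referenced in this issue as I understand it (worth checking against page 30). Here c is the number of classes, p(j) is the prior proportion of class j, and AUC(j, rest_j) is the two-class AUC of class j against all other classes pooled together:

```latex
\mathrm{AUNP} = \sum_{j=1}^{c} p(j)\,\mathrm{AUC}(j, \mathrm{rest}_j)
```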

I believe this is equivalent to ROC AUC with estimator = "macro_weighted", but I have seen this metric used by itself in a number of places, so it would probably be good to include it as a standalone version.

Reference: page 30 of https://www.math.ucdavis.edu/~saito/data/roc/ferri-class-perf-metrics.pdf

Essentially, this would create a wrapper around roc_auc_vec(estimator = "macro_weighted").

If you tackle this issue, it would be great if you could:

[ ] Use roc_auc() as a general guide for the structure.
[ ] First ensure that I am correct in understanding that this is the same as roc_auc(estimator = "macro_weighted"). Perhaps go out and find another package that implements this metric and check the results of a small example (that's what I would probably do).
[ ] Call it roc_aunp().
[ ] Pay close attention to how we generate documentation and examples automatically.
[ ] Try to understand how metric_summarizer() works, along with the rationale behind metric_vec_template(), then use them in the implementation.
[ ] Add a few tests. Small examples you can match from academic papers are best, but otherwise any online example is okay. If all else fails, create an example "by hand" whose result is easy to compute manually.
[ ] Use the file naming scheme prob-<metric>.R
[ ] No estimator argument would be required for this function. Please document that it is the same as weighted macro averaged roc_auc().
[ ] Please add a reference section to the linked paper.
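As an illustration of the "by hand" testing idea in the checklist above, here is a minimal base R sketch that computes AUNP directly from its definition: one-vs-rest AUCs, weighted by observed class proportions. The helper names (ovr_auc(), aunp_by_hand()) and the toy data are made up for this example and are not part of yardstick. Each per-class AUC uses the rank-sum (Mann-Whitney) identity, which counts ties as one half.

```r
# One-vs-rest AUC for a single class via the rank-sum (Mann-Whitney) identity.
# `is_pos` is a logical vector; `score` is the predicted probability of the class.
ovr_auc <- function(is_pos, score) {
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(score) # average ranks, so tied scores count as 1/2
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# AUNP: one-vs-rest AUCs weighted by the observed class proportions.
aunp_by_hand <- function(truth, probs) {
  classes <- levels(truth)
  aucs <- vapply(
    classes,
    function(k) ovr_auc(truth == k, probs[, k]),
    numeric(1)
  )
  prior <- as.numeric(table(truth)[classes]) / length(truth)
  sum(prior * aucs)
}

# Toy data (hypothetical): 6 observations, 3 classes, one probability row each.
truth <- factor(c("a", "a", "a", "b", "b", "c"), levels = c("a", "b", "c"))
probs <- matrix(
  c(0.7, 0.2, 0.1,
    0.6, 0.3, 0.1,
    0.2, 0.5, 0.3,
    0.3, 0.6, 0.1,
    0.1, 0.4, 0.5,
    0.2, 0.2, 0.6),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("a", "b", "c"))
)

# Per-class AUCs are 5/6, 7/8 and 1; the priors are 3/6, 2/6 and 1/6.
aunp_by_hand(truth, probs)
#> [1] 0.875
```

A value like this could then serve as the expected result in a test of the new metric, assuming the package handles tied scores the same way.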

The roc_aunp_impl() function will probably just call roc_auc_vec(truth, estimate, options, estimator = "macro_weighted", na_rm = FALSE, ...). Here na_rm = FALSE because missing values would already have been handled by the time the implementation function runs.
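A rough sketch of that wrapper idea, using only the exported roc_auc_vec(). It deliberately skips metric_summarizer() and metric_vec_template(), so it is not the package implementation, and it simply forwards na_rm instead of fixing it to FALSE. The name aunp_sketch_vec() is hypothetical, and the example assumes the hpc_cv data set that ships with yardstick, where obs is the truth and VF, F, M, L hold the class probabilities.

```r
library(yardstick)

# Hypothetical wrapper name, chosen for this sketch only.
aunp_sketch_vec <- function(truth, estimate, na_rm = TRUE, ...) {
  # `estimate` is a numeric matrix of class probabilities with one column per
  # level of `truth`, in the same order as levels(truth).
  roc_auc_vec(
    truth = truth,
    estimate = estimate,
    estimator = "macro_weighted", # prevalence-weighted one-vs-rest averaging
    na_rm = na_rm,
    ...
  )
}

# Example data from yardstick: a 4-class problem with class probabilities.
hpc <- yardstick::hpc_cv
prob_mat <- as.matrix(hpc[, c("VF", "F", "M", "L")])

aunp_sketch_vec(hpc$obs, prob_mat)
```

Because no estimator argument is exposed, callers cannot accidentally request a different averaging scheme, which matches the checklist item above.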

The Custom Metrics vignette will probably be helpful.

juliasilge commented 4 years ago

The mlr package has an implementation of AUNP (as well as AUNU), and I can confirm that the results are the same as using estimator = "macro_weighted".

library(mlr)
#> Loading required package: ParamHelpers
#> 'mlr' is in maintenance mode since July 2019. Future development
#> efforts will go into its successor 'mlr3' (<https://mlr3.mlr-org.com>).
library(tidymodels)
#> ── Attaching packages ────────────────────────────────────────────────────────────────────────────────────────────────────────── tidymodels 0.0.4 ──
#> ✓ broom     0.5.4          ✓ recipes   0.1.9     
#> ✓ dials     0.0.4          ✓ rsample   0.0.5     
#> ✓ dplyr     0.8.4          ✓ tibble    2.1.3     
#> ✓ ggplot2   3.2.1          ✓ tune      0.0.1     
#> ✓ infer     0.5.1          ✓ workflows 0.1.0.9000
#> ✓ parsnip   0.0.5          ✓ yardstick 0.0.5     
#> ✓ purrr     0.3.3
#> ── Conflicts ───────────────────────────────────────────────────────────────────────────────────────────────────────────── tidymodels_conflicts() ──
#> x purrr::discard()    masks scales::discard()
#> x dplyr::filter()     masks stats::filter()
#> x dplyr::lag()        masks stats::lag()
#> x ggplot2::margin()   masks dials::margin()
#> x recipes::step()     masks stats::step()
#> x recipes::yj_trans() masks scales::yj_trans()
library(stringr)
#> 
#> Attaching package: 'stringr'
#> The following object is masked from 'package:recipes':
#> 
#>     fixed
library(mlbench)

data("Soybean")

soybean_split <- initial_split(Soybean)

soybean_task <- makeClassifTask(data = training(soybean_split),
                               target = "Class")

lrn <- makeLearner("classif.rpart", predict.type = "prob")
mlr_mod <- train(lrn, soybean_task)
mlr_mod
#> Model for learner.id=classif.rpart; learner.class=classif.rpart
#> Trained on: task.id = training(soybean_split); obs = 513; features = 35
#> Hyperparameters: xval=0

pred <- predict(mlr_mod, newdata = testing(soybean_split))

measures_mlr <- performance(pred, 
                           measures = list(multiclass.aunu,
                                           multiclass.aunp))
measures_mlr
#> multiclass.aunu multiclass.aunp 
#>       0.9695286       0.9707133

measures_yardstick <- pred$data %>%
  rename_at(vars(starts_with("prob.")), ~ str_remove(., "^prob\\.")) %>%
  roc_auc(truth, `2-4-d-injury`:`rhizoctonia-root-rot`, estimator = "macro_weighted")

measures_yardstick
#> # A tibble: 1 x 3
#>   .metric .estimator     .estimate
#>   <chr>   <chr>              <dbl>
#> 1 roc_auc macro_weighted     0.971

testthat::expect_equal(unname(measures_mlr["multiclass.aunp"]),
                       measures_yardstick$.estimate)

Created on 2020-02-05 by the reprex package (v0.3.0)

DavisVaughan commented 4 years ago

Closed by #140

tidymodels / yardstick

New metric: AUNP #70