tidymodels / yardstick

Tidy methods for measuring model performance
https://yardstick.tidymodels.org/

New metric: AUNP #70

Closed: DavisVaughan closed this issue 4 years ago

DavisVaughan commented 5 years ago

AUC of each class against the rest, using the a priori class distribution:
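As a sketch of that definition, following the reference linked in this comment: the notation is assumed here, with $c$ the number of classes, $p(j)$ the a priori (observed) proportion of class $j$, and $\mathrm{AUC}(j, \mathrm{rest}_j)$ the two-class AUC of class $j$ against all other classes pooled together.

```latex
\mathrm{AUNP} = \sum_{j=1}^{c} p(j)\,\mathrm{AUC}(j, \mathrm{rest}_j)
```

Replacing $p(j)$ with the uniform weight $1/c$ gives the unweighted counterpart, AUNU.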

I believe this is equivalent to ROC AUC with estimator = "macro_weighted", but I have seen this metric used by itself in a number of places, so it would probably be good to include it as a standalone version.

Reference: Page 30 of: https://www.math.ucdavis.edu/~saito/data/roc/ferri-class-perf-metrics.pdf

Essentially, the new metric would be a thin wrapper around roc_auc_vec(estimator = "macro_weighted").

If you tackle this issue, it would be great if you could:

The roc_aunp_impl() function will probably just call roc_auc_vec(truth, estimate, options, estimator = "macro_weighted", na_rm = FALSE, ...). Here na_rm = FALSE is fine because missing values would already have been handled before the implementation function is called.

The Custom Metrics vignette will probably be helpful.
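A minimal sketch of the wrapper idea described above, not the code that was eventually merged. The name aunp_sketch_vec() is hypothetical, and the example assumes the hpc_cv data set that ships with yardstick, whose class probability columns VF, F, M and L are in the same order as the levels of obs.

```r
library(yardstick)

# Hypothetical helper for illustration only; not part of yardstick's API.
# It forwards to roc_auc_vec() with the prevalence-weighted estimator.
aunp_sketch_vec <- function(truth, estimate, na_rm = TRUE) {
  roc_auc_vec(truth, estimate, estimator = "macro_weighted", na_rm = na_rm)
}

# For a multiclass truth, `estimate` is a matrix with one column of class
# probabilities per factor level, in the same order as the levels.
probs <- as.matrix(hpc_cv[, c("VF", "F", "M", "L")])
aunp_sketch_vec(hpc_cv$obs, probs)
```

A real metric would also need the data frame method and the metric class attributes, which the Custom Metrics vignette covers.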

juliasilge commented 4 years ago

The mlr package has an implementation of AUNP (as well as AUNU), and I can confirm that the results are the same as using estimator = "macro_weighted".

library(mlr)
#> Loading required package: ParamHelpers
#> 'mlr' is in maintenance mode since July 2019. Future development
#> efforts will go into its successor 'mlr3' (<https://mlr3.mlr-org.com>).
library(tidymodels)
#> ── Attaching packages ────────────────────────────────────────────────────────────────────────────────────────────────────────── tidymodels 0.0.4 ──
#> ✓ broom     0.5.4          ✓ recipes   0.1.9     
#> ✓ dials     0.0.4          ✓ rsample   0.0.5     
#> ✓ dplyr     0.8.4          ✓ tibble    2.1.3     
#> ✓ ggplot2   3.2.1          ✓ tune      0.0.1     
#> ✓ infer     0.5.1          ✓ workflows 0.1.0.9000
#> ✓ parsnip   0.0.5          ✓ yardstick 0.0.5     
#> ✓ purrr     0.3.3
#> ── Conflicts ───────────────────────────────────────────────────────────────────────────────────────────────────────────── tidymodels_conflicts() ──
#> x purrr::discard()    masks scales::discard()
#> x dplyr::filter()     masks stats::filter()
#> x dplyr::lag()        masks stats::lag()
#> x ggplot2::margin()   masks dials::margin()
#> x recipes::step()     masks stats::step()
#> x recipes::yj_trans() masks scales::yj_trans()
library(stringr)
#> 
#> Attaching package: 'stringr'
#> The following object is masked from 'package:recipes':
#> 
#>     fixed
library(mlbench)

data("Soybean")

soybean_split <- initial_split(Soybean)

soybean_task <- makeClassifTask(data = training(soybean_split),
                                target = "Class")

lrn <- makeLearner("classif.rpart", predict.type = "prob")
mlr_mod <- train(lrn, soybean_task)
mlr_mod
#> Model for learner.id=classif.rpart; learner.class=classif.rpart
#> Trained on: task.id = training(soybean_split); obs = 513; features = 35
#> Hyperparameters: xval=0

pred <- predict(mlr_mod, newdata = testing(soybean_split))

measures_mlr <- performance(pred, 
                           measures = list(multiclass.aunu,
                                           multiclass.aunp))
measures_mlr
#> multiclass.aunu multiclass.aunp 
#>       0.9695286       0.9707133

measures_yardstick <- pred$data %>%
  # Strip mlr's literal "prob." prefix so the columns match the class levels
  rename_at(vars(starts_with("prob.")), ~ str_remove(., "^prob\\.")) %>%
  roc_auc(truth, `2-4-d-injury`:`rhizoctonia-root-rot`, estimator = "macro_weighted")

measures_yardstick
#> # A tibble: 1 x 3
#>   .metric .estimator     .estimate
#>   <chr>   <chr>              <dbl>
#> 1 roc_auc macro_weighted     0.971

testthat::expect_equal(unname(measures_mlr["multiclass.aunp"]),
                       measures_yardstick$.estimate)

Created on 2020-02-05 by the reprex package (v0.3.0)
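To make the "macro_weighted" weighting concrete, here is a rough base R sketch that computes the same kind of quantity by hand. It is an illustration under stated assumptions, not how mlr or yardstick implement it: the helper names are made up, the toy data is invented, each one-vs-rest AUC uses the rank-sum (Mann-Whitney) identity, and the columns of probs are assumed to be named after the levels of truth.

```r
# One-vs-rest AUC for a single class via the rank-sum identity.
# `is_class` is a logical vector, `score` the predicted probability of that class.
auc_one_vs_rest <- function(is_class, score) {
  n_pos <- sum(is_class)
  n_neg <- sum(!is_class)
  r <- rank(score)  # average ranks, so ties count as half a win
  (sum(r[is_class]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# AUNP: one-vs-rest AUCs weighted by the observed class proportions.
aunp_manual <- function(truth, probs) {
  lvls <- levels(truth)
  aucs <- vapply(
    lvls,
    function(l) auc_one_vs_rest(truth == l, probs[, l]),
    numeric(1)
  )
  prior <- as.numeric(table(truth)[lvls]) / length(truth)
  sum(prior * aucs)
}

# Toy, deliberately unbalanced data (made up for illustration).
truth <- factor(c("a", "a", "a", "b", "b", "c"))
probs <- cbind(
  a = c(0.8, 0.6, 0.3, 0.2, 0.4, 0.1),
  b = c(0.1, 0.3, 0.5, 0.6, 0.4, 0.2),
  c = c(0.1, 0.1, 0.2, 0.2, 0.2, 0.7)
)

aunp_manual(truth, probs)
```

On this toy data the per-class AUCs are 8/9, 7/8 and 1, with class proportions 3/6, 2/6 and 1/6, so the weighted sum is 65/72 (about 0.903). The unweighted mean of the same AUCs, which is what AUNU would report, is about 0.921.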

DavisVaughan commented 4 years ago

Closed by #140
