tidymodels / yardstick

Tidy methods for measuring model performance
https://yardstick.tidymodels.org/

Metric: Add metrics that rely on log-likelihood calculation #184

Closed. paulponcet closed this issue 3 years ago

paulponcet commented 4 years ago

Popular metrics such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the Hannan-Quinn information criterion require additional information about the fitted model, such as its log-likelihood or the number of observations. Could they be added to yardstick?

topepo commented 4 years ago

There is a problem with using any information criterion with yardstick: many models (e.g. random forest, knn) have no notion of the number of parameters. Even when they do, the number of parameters might be larger than the number of data points in the testing or assessment sets.

However, you can get these statistics from the training set (if the model can produce them):

library(tidymodels)
#> ── Attaching packages ───────────────────────────────────────────────────────────── tidymodels 0.1.1 ──
#> ✓ broom     0.7.0      ✓ recipes   0.1.13
#> ✓ dials     0.0.9      ✓ rsample   0.0.7 
#> ✓ dplyr     1.0.2      ✓ tibble    3.0.3 
#> ✓ ggplot2   3.3.2      ✓ tidyr     1.1.2 
#> ✓ infer     0.5.2      ✓ tune      0.1.1 
#> ✓ modeldata 0.0.2      ✓ workflows 0.2.0 
#> ✓ parsnip   0.1.3      ✓ yardstick 0.0.7 
#> ✓ purrr     0.3.4
#> ── Conflicts ──────────────────────────────────────────────────────────────── tidymodels_conflicts() ──
#> x purrr::discard() masks scales::discard()
#> x dplyr::filter()  masks stats::filter()
#> x dplyr::lag()     masks stats::lag()
#> x recipes::step()  masks stats::step()
data("two_class_dat")

set.seed(1)
folds <- vfold_cv(two_class_dat)

mod <- logistic_reg() %>% set_engine("glm")

extract_aic <- function(x) {
  # Pull the fitted parsnip model out of the workflow, then summarise
  # its training-set statistics (logLik, AIC, BIC, ...) as a one-row tibble
  pull_workflow_fit(x) %>% broom::glance()
}

resampled <-
  mod %>% 
  fit_resamples(Class ~ ., folds, control = control_resamples(extract = extract_aic))
resampled
#> # Resampling results
#> # 10-fold cross-validation 
#> # A tibble: 10 x 5
#>    splits           id     .metrics         .notes           .extracts       
#>    <list>           <chr>  <list>           <list>           <list>          
#>  1 <split [711/80]> Fold01 <tibble [2 × 3]> <tibble [0 × 1]> <tibble [1 × 1]>
#>  2 <split [712/79]> Fold02 <tibble [2 × 3]> <tibble [0 × 1]> <tibble [1 × 1]>
#>  3 <split [712/79]> Fold03 <tibble [2 × 3]> <tibble [0 × 1]> <tibble [1 × 1]>
#>  4 <split [712/79]> Fold04 <tibble [2 × 3]> <tibble [0 × 1]> <tibble [1 × 1]>
#>  5 <split [712/79]> Fold05 <tibble [2 × 3]> <tibble [0 × 1]> <tibble [1 × 1]>
#>  6 <split [712/79]> Fold06 <tibble [2 × 3]> <tibble [0 × 1]> <tibble [1 × 1]>
#>  7 <split [712/79]> Fold07 <tibble [2 × 3]> <tibble [0 × 1]> <tibble [1 × 1]>
#>  8 <split [712/79]> Fold08 <tibble [2 × 3]> <tibble [0 × 1]> <tibble [1 × 1]>
#>  9 <split [712/79]> Fold09 <tibble [2 × 3]> <tibble [0 × 1]> <tibble [1 × 1]>
#> 10 <split [712/79]> Fold10 <tibble [2 × 3]> <tibble [0 × 1]> <tibble [1 × 1]>

resampled_train_stats <- 
  resampled %>% 
  pull(.extracts) %>% 
  map_dfr(~ .x$.extracts[[1]])

resampled_train_stats
#> # A tibble: 10 x 8
#>    null.deviance df.null logLik   AIC   BIC deviance df.residual  nobs
#>            <dbl>   <int>  <dbl> <dbl> <dbl>    <dbl>       <int> <int>
#>  1          979.     710  -311.  627.  641.     621.         708   711
#>  2          981.     711  -298.  602.  615.     596.         709   712
#>  3          975.     711  -294.  595.  608.     589.         709   712
#>  4          982.     711  -299.  604.  617.     598.         709   712
#>  5          977.     711  -303.  612.  626.     606.         709   712
#>  6          979.     711  -307.  619.  633.     613.         709   712
#>  7          977.     711  -309.  625.  639.     619.         709   712
#>  8          980.     711  -310.  625.  639.     619.         709   712
#>  9          981.     711  -296.  597.  611.     591.         709   712
#> 10          978.     711  -305.  615.  629.     609.         709   712

Created on 2020-09-24 by the reprex package (v0.3.0)
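
The glance() output above reports AIC and BIC but not the Hannan-Quinn criterion mentioned in the original request. Below is a minimal sketch of computing it by hand from the same ingredients (log-likelihood, parameter count, number of training observations). It assumes the model has a logLik() method and uses base R only. The mtcars data and the names ll, k, n, and hqc are illustrative and not part of the reprex.

```r
# Illustrative logistic regression on a built-in data set (base R only).
fit <- glm(vs ~ mpg, data = mtcars, family = binomial)

ll <- logLik(fit)
k  <- attr(ll, "df")  # number of estimated parameters
n  <- nobs(fit)       # number of training observations

# Each criterion is -2 * logLik plus a penalty on the parameter count.
aic_manual <- -2 * as.numeric(ll) + 2 * k
bic_manual <- -2 * as.numeric(ll) + k * log(n)
hqc        <- -2 * as.numeric(ll) + 2 * k * log(log(n))  # Hannan-Quinn

# Sanity check against the built-in implementations.
stopifnot(isTRUE(all.equal(aic_manual, AIC(fit))))
stopifnot(isTRUE(all.equal(bic_manual, BIC(fit))))

c(AIC = aic_manual, BIC = bic_manual, HQC = hqc)
```

The same formula could be applied per fold to the logLik and nobs columns of the extracted tibble, provided the parameter count is known for the model in question.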

juliasilge commented 3 years ago

We don't plan to add information criterion metrics in the short to medium term for the above reasons, but you can determine them yourself when appropriate! 🙌

Let us know if you have further questions.
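
To illustrate the reasoning above: a metric computed on held-out data sees only the observed outcomes and the predictions. That is enough for a log-likelihood (yardstick's mn_log_loss() is a mean negative log-likelihood), but it carries no parameter count, so the penalty term of an information criterion cannot be formed. The sketch below uses base R and made-up values; truth, prob, and the other names are hypothetical.

```r
# Hypothetical held-out outcomes (1 = event) and predicted event probabilities.
truth <- c(1, 0, 1, 1, 0)
prob  <- c(0.9, 0.2, 0.6, 0.7, 0.4)

# Bernoulli log-likelihood of the held-out outcomes under the predictions.
holdout_loglik <- sum(dbinom(truth, size = 1, prob = prob, log = TRUE))

# Mean negative log-likelihood (log loss); lower is better.
log_loss <- -holdout_loglik / length(truth)

# Nothing in `truth` or `prob` says how many parameters the model has,
# so AIC/BIC-style penalties cannot be computed from these inputs alone.
log_loss
```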

github-actions[bot] commented 3 years ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.