tidymodels / probably

Tools for post-processing class probability estimates
https://probably.tidymodels.org/

Inconsistent appearance of `distance` metric in `threshold_perf()` #149

Open wjakethompson opened 1 week ago

wjakethompson commented 1 week ago

The `distance` metric appears inconsistently in the output of `threshold_perf()`. The function documentation states that the metrics calculated by default are `j_index()`, `sens()`, `spec()`, and `distance`. However, when I use the defaults, `distance` does not appear.

library(tidyverse)
library(probably)
library(yardstick)

set.seed(1234)

dat <- tibble(truth = factor(sample(c("a", "b"), size = 500, replace = TRUE)),
              estimate = runif(n = 500))

threshold_perf(dat, truth = truth, estimate = estimate,
               thresholds = 0.5)
#> # A tibble: 3 × 4
#>   .threshold .metric     .estimator .estimate
#>        <dbl> <chr>       <chr>          <dbl>
#> 1        0.5 sensitivity binary      0.522   
#> 2        0.5 specificity binary      0.478   
#> 3        0.5 j_index     binary      0.000528

Now suppose I want only sensitivity and specificity, so I provide a custom metric set that includes only `sens()` and `spec()`. Even though I asked for just two metrics, `distance` unexpectedly appears in the output.

threshold_perf(dat, truth = truth, estimate = estimate,
               thresholds = 0.5,
               metrics = metric_set(sens, spec))
#> # A tibble: 3 × 4
#>   .threshold .metric  .estimator .estimate
#>        <dbl> <chr>    <chr>          <dbl>
#> 1        0.5 sens     binary         0.522
#> 2        0.5 spec     binary         0.478
#> 3        0.5 distance binary         0.500
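
As a quick sanity check on that third row, the reported `distance` looks consistent with a squared distance from the ideal point (sensitivity = 1, specificity = 1). The formula below is my assumption based on the printed numbers, not something I confirmed in the package source, and the inputs are the rounded values from the output above.

```r
# Rounded values copied from the output above
sens_est <- 0.522
spec_est <- 0.478

# Assumed formula: squared distance to the ideal point (sens = 1, spec = 1)
distance_est <- (1 - sens_est)^2 + (1 - spec_est)^2

round(distance_est, 2)
#> [1] 0.5
```

If that assumption holds, the extra row is derived entirely from the two metrics I requested, which would explain why it only shows up when both are present.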

Based on the documentation, I would expect `distance` to show up in the first case, when the defaults are used. The documentation is less clear about the expected behavior in the second case. It says:

> If a custom metric is passed that does not compute sensitivity and specificity, the distance metric is not computed.

This suggests that `distance` is always included when the custom metric set contains both sensitivity and specificity, even if it was not explicitly requested. If so, the second case is behaving as documented. However, because the desired metrics were explicitly defined in the second case, I would expect only those metrics to be returned.
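
In the meantime, a possible workaround is to drop the unrequested rows after the fact. This is only a sketch: `res` is a hand-built stand-in for the second result above (values as printed), and `requested` and `res_requested` are names I made up for illustration. It filters the output and does not change what `threshold_perf()` computes.

```r
# Hand-built copy of the second result above (values rounded as printed)
res <- data.frame(
  .threshold = 0.5,
  .metric = c("sens", "spec", "distance"),
  .estimator = "binary",
  .estimate = c(0.522, 0.478, 0.500)
)

# Names of the metrics that were explicitly requested
requested <- c("sens", "spec")

# Keep only the rows for the requested metrics
res_requested <- res[res$.metric %in% requested, ]
res_requested$.metric
#> [1] "sens" "spec"
```

The same filter should work on the tibble returned by `threshold_perf()`, since it has a `.metric` column, but it would be cleaner if the function respected the metric set directly.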

Created on 2024-06-24 with reprex v2.1.0

Session info

``` r
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.1 (2024-06-14)
#>  os       macOS Sonoma 14.5
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       America/Chicago
#>  date     2024-06-24
#>  pandoc   3.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package      * version    date (UTC) lib source
#>  class          7.3-22     2023-05-03 [2] CRAN (R 4.4.1)
#>  cli            3.6.2      2023-12-11 [1] CRAN (R 4.4.0)
#>  codetools      0.2-20     2024-03-31 [2] CRAN (R 4.4.1)
#>  colorspace     2.1-0      2023-01-23 [1] CRAN (R 4.4.0)
#>  data.table     1.15.4     2024-03-30 [1] CRAN (R 4.4.0)
#>  dials          1.2.1      2024-02-22 [1] CRAN (R 4.4.0)
#>  DiceDesign     1.10       2023-12-07 [1] CRAN (R 4.4.0)
#>  digest         0.6.35     2024-03-11 [1] CRAN (R 4.4.0)
#>  dplyr        * 1.1.4      2023-11-17 [1] CRAN (R 4.4.0)
#>  evaluate       0.24.0     2024-06-10 [1] CRAN (R 4.4.0)
#>  fansi          1.0.6      2023-12-08 [1] CRAN (R 4.4.0)
#>  fastmap        1.2.0      2024-05-15 [1] CRAN (R 4.4.0)
#>  forcats      * 1.0.0      2023-01-29 [1] CRAN (R 4.4.0)
#>  foreach        1.5.2      2022-02-02 [1] CRAN (R 4.4.0)
#>  fs             1.6.4      2024-04-25 [1] CRAN (R 4.4.0)
#>  furrr          0.3.1      2022-08-15 [1] CRAN (R 4.4.0)
#>  future         1.33.2     2024-03-26 [1] CRAN (R 4.4.0)
#>  future.apply   1.11.2     2024-03-28 [1] CRAN (R 4.4.0)
#>  generics       0.1.3      2022-07-05 [1] CRAN (R 4.4.0)
#>  ggplot2      * 3.5.1      2024-04-23 [1] CRAN (R 4.4.0)
#>  globals        0.16.3     2024-03-08 [1] CRAN (R 4.4.0)
#>  glue           1.7.0      2024-01-09 [1] CRAN (R 4.4.0)
#>  gower          1.0.1      2022-12-22 [1] CRAN (R 4.4.0)
#>  GPfit          1.0-8      2019-02-08 [1] CRAN (R 4.4.0)
#>  gtable         0.3.5      2024-04-22 [1] CRAN (R 4.4.0)
#>  hardhat        1.4.0      2024-06-02 [1] CRAN (R 4.4.0)
#>  hms            1.1.3      2023-03-21 [1] CRAN (R 4.4.0)
#>  htmltools      0.5.8.1    2024-04-04 [1] CRAN (R 4.4.0)
#>  ipred          0.9-14     2023-03-09 [1] CRAN (R 4.4.0)
#>  iterators      1.0.14     2022-02-05 [1] CRAN (R 4.4.0)
#>  knitr          1.47       2024-05-29 [1] CRAN (R 4.4.0)
#>  lattice        0.22-6     2024-03-20 [2] CRAN (R 4.4.1)
#>  lava           1.8.0      2024-03-05 [1] CRAN (R 4.4.0)
#>  lhs            1.1.6      2022-12-17 [1] CRAN (R 4.4.0)
#>  lifecycle      1.0.4      2023-11-07 [1] CRAN (R 4.4.0)
#>  listenv        0.9.1      2024-01-29 [1] CRAN (R 4.4.0)
#>  lubridate    * 1.9.3      2023-09-27 [1] CRAN (R 4.4.0)
#>  magrittr       2.0.3      2022-03-30 [1] CRAN (R 4.4.0)
#>  MASS           7.3-60.2   2024-04-26 [2] CRAN (R 4.4.1)
#>  Matrix         1.7-0      2024-04-26 [2] CRAN (R 4.4.1)
#>  munsell        0.5.1      2024-04-01 [1] CRAN (R 4.4.0)
#>  nnet           7.3-19     2023-05-03 [2] CRAN (R 4.4.1)
#>  parallelly     1.37.1     2024-02-29 [1] CRAN (R 4.4.0)
#>  parsnip        1.2.1      2024-03-22 [1] CRAN (R 4.4.0)
#>  pillar         1.9.0      2023-03-22 [1] CRAN (R 4.4.0)
#>  pkgconfig      2.0.3      2019-09-22 [1] CRAN (R 4.4.0)
#>  probably     * 1.0.3      2024-02-23 [1] CRAN (R 4.4.0)
#>  prodlim        2023.08.28 2023-08-28 [1] CRAN (R 4.4.0)
#>  purrr        * 1.0.2      2023-08-10 [1] CRAN (R 4.4.0)
#>  R.cache        0.16.0     2022-07-21 [1] CRAN (R 4.4.0)
#>  R.methodsS3    1.8.2      2022-06-13 [1] CRAN (R 4.4.0)
#>  R.oo           1.26.0     2024-01-24 [1] CRAN (R 4.4.0)
#>  R.utils        2.12.3     2023-11-18 [1] CRAN (R 4.4.0)
#>  R6             2.5.1      2021-08-19 [1] CRAN (R 4.4.0)
#>  Rcpp           1.0.12     2024-01-09 [1] CRAN (R 4.4.0)
#>  readr        * 2.1.5      2024-01-10 [1] CRAN (R 4.4.0)
#>  recipes        1.0.10     2024-02-18 [1] CRAN (R 4.4.0)
#>  reprex         2.1.0      2024-01-11 [1] CRAN (R 4.4.0)
#>  rlang          1.1.4      2024-06-04 [1] CRAN (R 4.4.0)
#>  rmarkdown      2.27       2024-05-17 [1] CRAN (R 4.4.0)
#>  rpart          4.1.23     2023-12-05 [2] CRAN (R 4.4.1)
#>  rsample        1.2.1      2024-03-25 [1] CRAN (R 4.4.0)
#>  rstudioapi     0.16.0     2024-03-24 [1] CRAN (R 4.4.0)
#>  scales         1.3.0      2023-11-28 [1] CRAN (R 4.4.0)
#>  sessioninfo    1.2.2      2021-12-06 [1] CRAN (R 4.4.0)
#>  stringi        1.8.4      2024-05-06 [1] CRAN (R 4.4.0)
#>  stringr      * 1.5.1      2023-11-14 [1] CRAN (R 4.4.0)
#>  styler         1.10.3     2024-04-07 [1] CRAN (R 4.4.0)
#>  survival       3.6-4      2024-04-24 [2] CRAN (R 4.4.1)
#>  tibble       * 3.2.1      2023-03-20 [1] CRAN (R 4.4.0)
#>  tidyr        * 1.3.1      2024-01-24 [1] CRAN (R 4.4.0)
#>  tidyselect     1.2.1      2024-03-11 [1] CRAN (R 4.4.0)
#>  tidyverse    * 2.0.0      2023-02-22 [1] CRAN (R 4.4.0)
#>  timechange     0.3.0      2024-01-18 [1] CRAN (R 4.4.0)
#>  timeDate       4032.109   2023-12-14 [1] CRAN (R 4.4.0)
#>  tune           1.2.1      2024-04-18 [1] CRAN (R 4.4.0)
#>  tzdb           0.4.0      2023-05-12 [1] CRAN (R 4.4.0)
#>  utf8           1.2.4      2023-10-22 [1] CRAN (R 4.4.0)
#>  vctrs          0.6.5      2023-12-01 [1] CRAN (R 4.4.0)
#>  withr          3.0.0      2024-01-16 [1] CRAN (R 4.4.0)
#>  workflows      1.1.4      2024-02-19 [1] CRAN (R 4.4.0)
#>  xfun           0.45       2024-06-16 [1] CRAN (R 4.4.0)
#>  yaml           2.3.8      2023-12-11 [1] CRAN (R 4.4.0)
#>  yardstick    * 1.3.1      2024-03-21 [1] CRAN (R 4.4.0)
#>
#>  [1] /Users/jakethompson/Library/R/arm64/4.4/library
#>  [2] /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
```