Closed by edgararuiz 2 years ago
I made some aesthetic changes in the code, most notably to get rid of "binary" in the title (see 965f8d4).
I added a code comment about test failures on my local machine.
We may want to save the tune_grid() results as an RData object. We can worry about that later since we may need more of them.
One last question... does the windowed code capture all of the data? When I look at the results of
library(probably)
#>
#> Attaching package: 'probably'
#> The following objects are masked from 'package:base':
#>
#> as.factor, as.ordered
cal_plot_windowed(
  segment_logistic,
  Class,
  .pred_good
)
Created on 2022-11-13 by the reprex package (v2.0.1)
The right-hand side of the line doesn't get near values of 1 on the x-axis (it does get near zero on the left-hand side). Are we getting all the data at the ends?
Looking through the window code, I see that the steps are defined in terms of the data and not the probability range. Sorry, I probably wasn't explicit enough when I described that.
I was thinking that the windows walk along the [0,1] range and capture the observed probability estimates that fall in bins (which probably overlap).
Here's an example of what that would look like:
windows_size <- 0.10
step_size <- 0.05
steps <- seq(0, 1, by = step_size)
lower_cut <- steps - (windows_size / 2)
lower_cut[lower_cut < 0] <- 0
upper_cut <- steps + (windows_size / 2)
upper_cut[upper_cut > 1] <- 1
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(ggplot2)
tibble(
  step = steps,
  lower = lower_cut,
  upper = upper_cut,
  group = seq_along(steps)
) %>%
  ggplot(aes(x = step, y = group)) +
  geom_errorbar(aes(xmin = lower, xmax = upper)) +
  theme_bw()
Created on 2022-11-13 by the reprex package (v2.0.1)
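To make the idea concrete, here is a rough sketch of how windows defined on the [0, 1] probability scale could be used to summarize the data. This is not the package's implementation: it uses simulated predictions and outcomes rather than segment_logistic, and the names pred, event, window_size and windowed are made up for illustration. It only assumes base R.

```r
# Hypothetical sketch: windows walk along [0, 1] rather than along the data.
set.seed(1)
n <- 1000
pred <- runif(n)             # simulated probability estimates
event <- rbinom(n, 1, pred)  # simulated outcomes (calibrated by construction)

window_size <- 0.10
step_size <- 0.05
steps <- seq(0, 1, by = step_size)

# Clamp the window edges so they never leave the [0, 1] range
lower_cut <- pmax(steps - window_size / 2, 0)
upper_cut <- pmin(steps + window_size / 2, 1)

# For each window, keep the predictions that fall inside it and
# compute the observed event rate. Windows overlap, so a single
# prediction can contribute to more than one window.
windowed <- do.call(rbind, lapply(seq_along(steps), function(i) {
  in_window <- pred >= lower_cut[i] & pred <= upper_cut[i]
  data.frame(
    midpoint = steps[i],
    lower = lower_cut[i],
    upper = upper_cut[i],
    n = sum(in_window),
    event_rate = if (any(in_window)) mean(event[in_window]) else NA_real_
  )
}))

windowed
```

Because the first and last windows are anchored at 0 and 1, predictions near the ends of the range are always covered, which is the behavior I was asking about above.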
Adds 3 calibration plots: breaks (bins), logistic (gam) and sliding window. They have the corresponding table functions. They also support data.frame and tune_results objects. R Notebook with sample code and plots: https://colorado.rstudio.com/rsc/edgar/calibration-plots/cal_plots.nb.html