Thie1e / cutpointr

Optimal cutpoints in R: determining and validating optimal cutpoints in binary classification
https://cran.r-project.org/package=cutpointr

Calculating confidence intervals in cutpointr #57

Closed: LauraL231 closed this issue 1 year ago

LauraL231 commented 1 year ago

Hi, is there a way to calculate 95% CIs for PPV and NPV in cutpointr? I have managed to do this for the AUC but not for PPV or NPV. Thanks

Thie1e commented 1 year ago

Hi,

did you use the add_metric function to add PPV and NPV to the results? Maybe you can adapt the code below for your application.

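The boot_ci function (which you have already found, I assume) extracts percentiles from the bootstrap distribution of a metric. As the output below shows, these are the 2.5% and 97.5% percentiles when nothing else is specified, which gives a 95% interval. The in_bag argument selects whether the in-bag or the out-of-bag values are used.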

library(cutpointr)
library(tidyverse)

oc <- cutpointr(suicide, dsi, suicide, boot_runs = 1000, metric = sum_sens_spec)
#> Assuming the positive class is yes
#> Assuming the positive class has higher x values
#> Running bootstrap...

oc <- oc %>% 
    add_metric(metric = list(ppv, npv))

oc
#> # A tibble: 1 x 18
#>   direction optimal_cutpoint method          sum_sens_spec      acc sensitivity
#>   <chr>                <dbl> <chr>                   <dbl>    <dbl>       <dbl>
#> 1 >=                       2 maximize_metric       1.75179 0.864662    0.888889
#>   specificity      AUC pos_class neg_class prevalence outcome predictor
#>         <dbl>    <dbl> <fct>     <fct>          <dbl> <chr>   <chr>    
#> 1    0.862903 0.923779 yes       no         0.0676692 suicide dsi      
#>   data               roc_curve                 boot                    ppv
#>   <list>             <list>                    <list>                <dbl>
#> 1 <tibble [532 x 2]> <roc_cutpointr [13 x 10]> <tibble [1,000 x 27]>  0.32
#>        npv
#>      <dbl>
#> 1 0.990741

# in_bag = FALSE -> apply cutpoints to the out-of-bag observations
boot_ci(x = oc, variable = ppv, in_bag = FALSE)
#> # A tibble: 2 x 2
#>   quantile values
#>      <dbl>  <dbl>
#> 1    0.025  0.167
#> 2    0.975  0.447
boot_ci(x = oc, variable = npv, in_bag = FALSE)
#> # A tibble: 2 x 2
#>   quantile values
#>      <dbl>  <dbl>
#> 1    0.025  0.969
#> 2    0.975  1

Created on 2022-11-10 by the reprex package (v2.0.1)
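
To make the percentile idea behind boot_ci more concrete, here is a minimal base-R sketch of an out-of-bag percentile bootstrap for PPV and NPV. It is not cutpointr's implementation, only an illustration of the general technique. The simulated data and the helper names ppv_npv and best_cutpoint are hypothetical, and the sketch assumes that higher predictor values indicate the positive class.

```r
# Base-R sketch of out-of-bag percentile bootstrap CIs for PPV and NPV.
# Simulated data and helper names are hypothetical, not part of cutpointr.
set.seed(42)
n <- 500
y <- rbinom(n, 1, 0.2)                      # 1 = positive class
x <- rnorm(n, mean = ifelse(y == 1, 2, 0))  # predictor, higher in positives

# PPV and NPV when predicting "positive" for x >= cutpoint
ppv_npv <- function(x, y, cutpoint) {
  pred <- x >= cutpoint
  c(ppv = sum(pred & y == 1) / sum(pred),
    npv = sum(!pred & y == 0) / sum(!pred))
}

# Cutpoint that maximizes sensitivity + specificity
best_cutpoint <- function(x, y) {
  candidates <- sort(unique(x))
  score <- vapply(candidates, function(cp) {
    pred <- x >= cp
    mean(pred[y == 1]) + mean(!pred[y == 0])
  }, numeric(1))
  candidates[which.max(score)]
}

boot_runs <- 200
oob <- t(replicate(boot_runs, {
  idx <- sample.int(n, replace = TRUE)   # in-bag rows (drawn with replacement)
  out <- setdiff(seq_len(n), idx)        # out-of-bag rows (never drawn)
  cp  <- best_cutpoint(x[idx], y[idx])   # cutpoint chosen on the in-bag sample
  ppv_npv(x[out], y[out], cp)            # metrics evaluated out-of-bag
}))

# 2.5% and 97.5% percentiles of the bootstrap distributions
ci <- apply(oob, 2, quantile, probs = c(0.025, 0.975), na.rm = TRUE)
print(ci)
```

Evaluating each cutpoint on the observations that were not drawn into the bootstrap sample corresponds to the in_bag = FALSE choice above. Evaluating it on x[idx] and y[idx] instead would give the in-bag variant, which tends to look more optimistic because the cutpoint was tuned on the same rows.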