Thie1e / cutpointr

Optimal cutpoints in R: determining and validating optimal cutpoints in binary classification
https://cran.r-project.org/package=cutpointr

Can we specify the bootstrap sampling size? #54

Closed: jwang-lilly closed this issue 2 years ago

jwang-lilly commented 2 years ago

Hi Christian,

Is there any way to specify the bootstrap sampling size in cutpointr() instead of using the same sample size as the input data?

Thanks much @Thie1e

Thie1e commented 2 years ago

Hi Jian,

Sorry, we don't have that option. For the built-in bootstrap you can only choose whether it is stratified (boot_stratify = TRUE / FALSE).

You could write a custom function to do that. Maybe something like this? What question exactly are you trying to answer?

library(cutpointr)
library(tidyverse)
boot_reps <- 10     # number of resamples
sample_size <- 200  # rows per resample; must not exceed nrow(suicide)
custom_boot <- map_df(1:boot_reps, function(i) {
    # Make smaller sample
    tempdat <- suicide %>% 
        sample_n(sample_size)
    # Bootstrap that sample
    boot_rows <- sample(x = nrow(tempdat), 
                        size = sample_size, 
                        replace = TRUE)
    tempdat <- tempdat[boot_rows, ]
    cutpointr(data = tempdat, 
              x = dsi, 
              class = suicide,
              direction = ">=",
              pos_class = "yes") %>% 
        select(optimal_cutpoint, method, sum_sens_spec, AUC, data)
})

custom_boot
#> # A tibble: 10 x 5
#>    optimal_cutpoint method          sum_sens_spec      AUC data              
#>               <dbl> <chr>                   <dbl>    <dbl> <list>            
#>  1                3 maximize_metric       1.81560 0.932624 <tibble [200 x 2]>
#>  2                2 maximize_metric       1.69910 0.870631 <tibble [200 x 2]>
#>  3                2 maximize_metric       1.59453 0.804247 <tibble [200 x 2]>
#>  4                4 maximize_metric       1.68820 0.842655 <tibble [200 x 2]>
#>  5                1 maximize_metric       1.81152 0.915067 <tibble [200 x 2]>
#>  6                4 maximize_metric       1.78694 0.882732 <tibble [200 x 2]>
#>  7                5 maximize_metric       1.45408 0.683036 <tibble [200 x 2]>
#>  8                2 maximize_metric       1.84239 0.958220 <tibble [200 x 2]>
#>  9                4 maximize_metric       1.85897 0.953329 <tibble [200 x 2]>
#> 10                4 maximize_metric       1.89362 0.975621 <tibble [200 x 2]>

Created on 2022-02-26 by the reprex package (v2.0.1)
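
The same two-step idea (subsample without replacement, then bootstrap within the subsample) could be wrapped in a small reusable helper. The following is only a sketch: `resample_fixed_size()` and its arguments are hypothetical names, not part of cutpointr, and the seed is an arbitrary choice added for reproducibility. The cutpointr part runs only if the package is installed.

```r
# Hypothetical helper, not part of cutpointr: draw `boot_reps` resamples of
# `sample_size` rows each and apply `fit_fun` to every resample.
resample_fixed_size <- function(data, sample_size, boot_reps, fit_fun,
                                seed = NULL) {
  stopifnot(sample_size >= 1, sample_size <= nrow(data), boot_reps >= 1)
  if (!is.null(seed)) set.seed(seed)
  lapply(seq_len(boot_reps), function(i) {
    # Step 1: subsample without replacement down to the target size
    sub_rows <- sample(nrow(data), size = sample_size, replace = FALSE)
    sub <- data[sub_rows, , drop = FALSE]
    # Step 2: ordinary bootstrap (with replacement) within that subsample
    boot_rows <- sample(nrow(sub), size = sample_size, replace = TRUE)
    fit_fun(sub[boot_rows, , drop = FALSE])
  })
}

# Usage with cutpointr, mirroring the call from the reprex above.
if (requireNamespace("cutpointr", quietly = TRUE)) {
  library(cutpointr)
  fits <- resample_fixed_size(
    suicide, sample_size = 200, boot_reps = 10, seed = 123,
    fit_fun = function(d) {
      # Assumes every resample contains both classes; cutpointr() would
      # fail on a resample without any "yes" cases.
      cp <- cutpointr(data = d, x = dsi, class = suicide,
                      direction = ">=", pos_class = "yes")
      c(optimal_cutpoint = cp$optimal_cutpoint, AUC = cp$AUC)
    }
  )
  boot_summary <- do.call(rbind, fits)
  # Spread of the cutpoint and AUC across the resamples
  print(apply(boot_summary, 2, quantile, probs = c(0.025, 0.5, 0.975)))
}
```

Passing the fitting step as `fit_fun` keeps the resampling logic independent of cutpointr, so the same helper could be reused for the other analyses being compared.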

jwang-lilly commented 2 years ago

Excellent, Christian. Thanks very much! I wanted to limit the sample size so that the results can be compared fairly with other analyses.