tidymodels / dials

Tools for creating tuning parameter values
https://dials.tidymodels.org/
Other
111 stars 26 forks source link

num_terms (and likely num_comp) throw error in training when doubles supplied instead of integers. #132

Closed yogat3ch closed 3 years ago

yogat3ch commented 4 years ago

Hi folks, I think this is an easy bug to fix.

I kept encountering the note: internal: Error: Must subset elements with a valid subscript vector.\n\033[31mx\033[39m Can't convert from <double> to <integer> due to loss of precision. while training with tune_grid.

It appears to be related to dials::num_terms demonstrated in the reprex below:

.dt <-
  data.frame(
    y = rnorm(20),
    x = rnorm(20),
    time = seq(
      from = lubridate::today() - 19,
      to = lubridate::today(),
      by = "day"
    )
  )

rec <- recipes::recipe(y ~ ., data = .dt)

.roll <- rsample::rolling_origin(
  .dt,
  initial = 5,
  skip = 5,
  cumulative = TRUE
)

par_mars = dials::parameters(
  list(
    num_terms = dials::num_terms(c(1,1)),
    prod_degree = dials::prod_degree(),
    prune_method = dials::prune_method(dials::values_prune_method[!dials::values_prune_method %in% c("backward", "cv", "exhaustive", "none")])
  )
)

mod_mars = parsnip::mars(
  mode = "regression",
  num_terms = tune::tune(),
  prod_degree = tune::tune(),
  prune_method = tune::tune()
) %>% 
  parsnip::set_engine("earth")
.tgr <- tune::tune_grid(
  mod_mars,
  # preprocessor = frm,
  preprocessor = rec,
  resamples = .roll,
  # iter = 2,
  # objective = "rmse",
  param_info = par_mars,
  grid = dials::grid_latin_hypercube(
    par_mars,
    size = 10
  ),
  control = tune::control_grid(
    verbose = TRUE,
    allow_par = TRUE,
    pkgs = "earth"
  )
)

By simply changing: num_terms = dials::num_terms(c(1,1)) to num_terms = dials::num_terms(c(1L,1L)) it appears to resolve the issue.

I think this could be fixed by changing param_num_comp.R#L19 & param_num_comp.R#L32 which currently read: range = range, to range = vctrs::vec_cast.integer(range, "integer"),

Update 2020-07-15 09:05 I can't seem to replicate the fix by changing the range to integers in num_terms anymore. I'm not sure what changed, but this solution no longer seems to work. Update 2020-07-15 09:06 Removing num_terms altogether from the tuning does fix the issue, so num_terms is definitely involved in causing the error - though I'm not sure where.

Thanks!

topepo commented 4 years ago

I suspect that you had an older version of dials. There was no session info in the original issue. Can you provide one now?

yogat3ch commented 4 years ago

Oops, apologies on forgetting that. Dials is 0.0.8. I updated dials (though I don't think it actually updated since 0.0.8 is the most recent). Tested again after doing so and am still encountering the error. I don't have that particular session's info but here is the sessionInfo while working on that same project:

R version 3.5.3 (2019-03-11)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] tune_0.1.1.9000      parsnip_0.1.2        recipes_0.1.10      
 [4] earth_5.1.2          plotmo_3.5.6         TeachingDemos_2.10  
 [7] plotrix_3.7-7        Formula_1.2-3        RevoUtils_11.0.3    
[10] AlpacaforR_1.0.0     printr_0.1           dplyr_1.0.0         
[13] magrittr_1.5         RevoUtilsMath_11.0.0

loaded via a namespace (and not attached):
 [1] httr_1.4.1           tidyr_1.1.0          bit64_0.9-7         
 [4] jsonlite_1.7.0       splines_3.5.3        foreach_1.5.0       
 [7] prodlim_2019.11.13   assertthat_0.2.1     GPfit_1.0-8         
[10] blob_1.2.1           yaml_2.2.1           globals_0.12.5      
[13] ipred_0.9-9          pillar_1.4.6.9000    RSQLite_2.2.0.9000  
[16] lattice_0.20-38      glue_1.4.1.9000      pROC_1.16.2         
[19] digest_0.6.25.1      hardhat_0.1.4        colorspace_1.4-1    
[22] websocket_1.1.0      Matrix_1.2-15        plyr_1.8.6          
[25] timeDate_3043.102    pkgconfig_2.0.3      lhs_1.0.2           
[28] DiceDesign_1.8-1     listenv_0.8.0        purrr_0.3.4.9000    
[31] scales_1.1.1.9000    gower_0.2.2          lava_1.6.7          
[34] tibble_3.0.3.9000    generics_0.0.2       ggplot2_3.3.2       
[37] ellipsis_0.3.1       withr_2.2.0          furrr_0.1.0.9002    
[40] nnet_7.3-12          cli_2.0.2            survival_2.43-3     
[43] crayon_1.3.4         memoise_1.1.0        fs_1.4.1.9000       
[46] future_1.18.0        fansi_0.4.1          MASS_7.3-51.1       
[49] dials_0.0.8          class_7.3-15         tools_3.5.3         
[52] lifecycle_0.2.0.9000 stringr_1.4.0        rsample_0.0.7       
[55] munsell_0.5.0        packrat_0.5.0        compiler_3.5.3      
[58] rlang_0.4.7.9000     grid_3.5.3           iterators_1.0.12    
[61] yardstick_0.0.6      rstudioapi_0.11      HDA_0.0.0.9000      
[64] gtable_0.3.0         codetools_0.2-16     DBI_1.1.0           
[67] R6_2.4.1             lubridate_1.7.9      knitr_1.29.3        
[70] bit_1.1-15.2         workflows_0.1.2      stringi_1.4.7       
[73] parallel_3.5.3       Rcpp_1.0.5.2         vctrs_0.3.2         
[76] rpart_4.1-13         catchr_0.2.2.9000    tidyselect_1.1.0    
[79] xfun_0.15.1

Thanks for checking on this!

topepo commented 3 years ago

Try with version 0.0.9 (on CRAN)

yogat3ch commented 3 years ago

Will do tomorrow and report the results!

yogat3ch commented 3 years ago

Everything checks out! Issue resolved! Thanks @topepo & @juliasilge !

github-actions[bot] commented 3 years ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.