tidymodels / dials

Tools for creating tuning parameter values
https://dials.tidymodels.org/

Can't tune recipes::step_window size parameter #180

Closed camroberts closed 3 years ago

camroberts commented 3 years ago

The problem

I'm having trouble setting up step_window() to tune its size parameter. The parameter doesn't seem to be recognized by dials::parameters(), and the result of tunable() looks inconsistent: it lists a different parameter name (window instead of size), and the default function in call_info is listed as dials::window instead of dials::window_size.

This looks related to issue #112, but that issue suggests the problem is isolated to R 4.0, whereas I'm using R 3.6.3.

For completeness, I also tried to tune the statistic parameter of step_window(), since it is listed as tunable too. However, an error is thrown when the recipe is created.

Reproducible example

library(tidyverse)
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip

data <- tibble(x = 1:1000,
               y = 3 + .1*x + rnorm(1000))

rec <- recipe(y ~ ., data=data) %>% 
  step_window(x, size=tune())

# The size param is not listed
parameters(rec)
#> Collection of 0 parameters for tuning
#> 
#> [1] identifier type       object    
#> <0 rows> (or 0-length row.names)

# What's tunable?
t <- tunable(rec)
t
#> # A tibble: 2 x 5
#>   name      call_info        source component   component_id
#>   <chr>     <list>           <chr>  <chr>       <chr>       
#> 1 statistic <named list [2]> recipe step_window window_WlPmQ
#> 2 window    <named list [2]> recipe step_window window_WlPmQ

# The "size" arg in step_window is missing. Listed as "window" instead.
# What is the default func?
t$call_info[[2]]
#> $pkg
#> [1] "dials"
#> 
#> $fun
#> [1] "window"
# dials::window doesn't exist. I think it should be dials::window_size.

# tunable says that the "statistic" param is tunable too. But this causes another error:
rec2 <- recipe(y ~ ., data=data) %>% 
  step_window(x, statistic=tune())
#> Error in match(x, table, nomatch = 0L): 'match' requires vector arguments

Created on 2021-09-01 by the reprex package (v2.0.0)

Session info

``` r
sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value
#>  version  R version 3.6.3 (2020-02-29)
#>  os       Windows 10 x64
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language (EN)
#>  collate  English_Australia.1252
#>  ctype    English_Australia.1252
#>  tz       Australia/Brisbane
#>  date     2021-09-01
#>
#> - Packages -------------------------------------------------------------------
#>  package * version date lib source
#>  assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.1)
#>  backports 1.2.1 2020-12-09 [1] CRAN (R 3.6.3)
#>  broom * 0.7.6 2021-04-05 [1] CRAN (R 3.6.3)
#>  cellranger 1.1.0 2016-07-27 [1] CRAN (R 3.6.1)
#>  class 7.3-18 2021-01-24 [1] CRAN (R 3.6.3)
#>  cli 2.5.0 2021-04-26 [1] CRAN (R 3.6.3)
#>  codetools 0.2-18 2020-11-04 [1] CRAN (R 3.6.3)
#>  colorspace 2.0-0 2020-11-11 [1] CRAN (R 3.6.3)
#>  crayon 1.4.1 2021-02-08 [1] CRAN (R 3.6.3)
#>  DBI 1.1.1 2021-01-15 [1] CRAN (R 3.6.3)
#>  dbplyr 2.1.1 2021-04-06 [1] CRAN (R 3.6.3)
#>  dials * 0.0.9 2020-09-16 [1] CRAN (R 3.6.3)
#>  DiceDesign 1.9 2021-02-13 [1] CRAN (R 3.6.3)
#>  digest 0.6.27 2020-10-24 [1] CRAN (R 3.6.3)
#>  dplyr * 1.0.5 2021-03-05 [1] CRAN (R 3.6.3)
#>  ellipsis 0.3.1 2020-05-15 [1] CRAN (R 3.6.3)
#>  evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.1)
#>  fansi 0.4.2 2021-01-15 [1] CRAN (R 3.6.3)
#>  forcats * 0.5.1 2021-01-27 [1] CRAN (R 3.6.3)
#>  foreach 1.5.1 2020-10-15 [1] CRAN (R 3.6.3)
#>  fs 1.5.0 2020-07-31 [1] CRAN (R 3.6.3)
#>  furrr 0.2.2 2021-01-29 [1] CRAN (R 3.6.3)
#>  future 1.21.0 2020-12-10 [1] CRAN (R 3.6.3)
#>  generics 0.1.0 2020-10-31 [1] CRAN (R 3.6.3)
#>  ggplot2 * 3.3.3 2020-12-30 [1] CRAN (R 3.6.3)
#>  globals 0.14.0 2020-11-22 [1] CRAN (R 3.6.3)
#>  glue 1.4.2 2020-08-27 [1] CRAN (R 3.6.3)
#>  gower 0.2.2 2020-06-23 [1] CRAN (R 3.6.3)
#>  GPfit 1.0-8 2019-02-08 [1] CRAN (R 3.6.3)
#>  gtable 0.3.0 2019-03-25 [1] CRAN (R 3.6.1)
#>  hardhat 0.1.6 2021-07-14 [1] CRAN (R 3.6.3)
#>  haven 2.3.1 2020-06-01 [1] CRAN (R 3.6.3)
#>  highr 0.8 2019-03-20 [1] CRAN (R 3.6.1)
#>  hms 1.0.0 2021-01-13 [1] CRAN (R 3.6.3)
#>  htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 3.6.3)
#>  httr 1.4.2 2020-07-20 [1] CRAN (R 3.6.3)
#>  infer * 0.5.4 2021-01-13 [1] CRAN (R 3.6.3)
#>  ipred 0.9-11 2021-03-12 [1] CRAN (R 3.6.3)
#>  iterators 1.0.13 2020-10-15 [1] CRAN (R 3.6.3)
#>  jsonlite 1.7.2 2020-12-09 [1] CRAN (R 3.6.3)
#>  knitr 1.31 2021-01-27 [1] CRAN (R 3.6.3)
#>  lattice 0.20-41 2020-04-02 [1] CRAN (R 3.6.3)
#>  lava 1.6.9 2021-03-11 [1] CRAN (R 3.6.3)
#>  lhs 1.1.1 2020-10-05 [1] CRAN (R 3.6.3)
#>  lifecycle 1.0.0 2021-02-15 [1] CRAN (R 3.6.3)
#>  listenv 0.8.0 2019-12-05 [1] CRAN (R 3.6.1)
#>  lubridate 1.7.10 2021-02-26 [1] CRAN (R 3.6.3)
#>  magrittr 2.0.1 2020-11-17 [1] CRAN (R 3.6.3)
#>  MASS 7.3-53.1 2021-02-12 [1] CRAN (R 3.6.3)
#>  Matrix 1.3-2 2021-01-06 [1] CRAN (R 3.6.3)
#>  modeldata * 0.1.1 2021-07-14 [1] CRAN (R 3.6.3)
#>  modelr 0.1.8 2020-05-19 [1] CRAN (R 3.6.3)
#>  munsell 0.5.0 2018-06-12 [1] CRAN (R 3.6.1)
#>  nnet 7.3-15 2021-01-24 [1] CRAN (R 3.6.3)
#>  parallelly 1.24.0 2021-03-14 [1] CRAN (R 3.6.3)
#>  parsnip * 0.1.7 2021-07-21 [1] CRAN (R 3.6.3)
#>  pillar 1.5.1 2021-03-05 [1] CRAN (R 3.6.3)
#>  pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.1)
#>  plyr 1.8.6 2020-03-03 [1] CRAN (R 3.6.3)
#>  pROC 1.17.0.1 2021-01-13 [1] CRAN (R 3.6.3)
#>  prodlim 2019.11.13 2019-11-17 [1] CRAN (R 3.6.3)
#>  ps 1.6.0 2021-02-28 [1] CRAN (R 3.6.3)
#>  purrr * 0.3.4 2020-04-17 [1] CRAN (R 3.6.3)
#>  R6 2.5.0 2020-10-28 [1] CRAN (R 3.6.3)
#>  Rcpp 1.0.6 2021-01-15 [1] CRAN (R 3.6.3)
#>  readr * 1.4.0 2020-10-05 [1] CRAN (R 3.6.3)
#>  readxl 1.3.1 2019-03-13 [1] CRAN (R 3.6.1)
#>  recipes * 0.1.16 2021-04-16 [1] CRAN (R 3.6.3)
#>  reprex 2.0.0 2021-04-02 [1] CRAN (R 3.6.3)
#>  rlang 0.4.10 2020-12-30 [1] CRAN (R 3.6.3)
#>  rmarkdown 2.7 2021-02-19 [1] CRAN (R 3.6.3)
#>  rpart 4.1-15 2019-04-12 [1] CRAN (R 3.6.3)
#>  rsample * 0.1.0 2021-05-08 [1] CRAN (R 3.6.3)
#>  rstudioapi 0.13 2020-11-12 [1] CRAN (R 3.6.3)
#>  rvest 1.0.0 2021-03-09 [1] CRAN (R 3.6.3)
#>  scales * 1.1.1 2020-05-11 [1] CRAN (R 3.6.3)
#>  sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.1)
#>  stringi 1.5.3 2020-09-09 [1] CRAN (R 3.6.3)
#>  stringr * 1.4.0 2019-02-10 [1] CRAN (R 3.6.1)
#>  styler 1.4.1 2021-03-30 [1] CRAN (R 3.6.3)
#>  survival 3.2-10 2021-03-16 [1] CRAN (R 3.6.3)
#>  tibble * 3.1.0 2021-02-25 [1] CRAN (R 3.6.3)
#>  tidymodels * 0.1.2 2020-11-22 [1] CRAN (R 3.6.3)
#>  tidyr * 1.1.3 2021-03-03 [1] CRAN (R 3.6.3)
#>  tidyselect 1.1.0 2020-05-11 [1] CRAN (R 3.6.3)
#>  tidyverse * 1.3.0 2019-11-21 [1] CRAN (R 3.6.3)
#>  timeDate 3043.102 2018-02-21 [1] CRAN (R 3.6.2)
#>  tune * 0.1.6 2021-07-21 [1] CRAN (R 3.6.3)
#>  utf8 1.2.1 2021-03-12 [1] CRAN (R 3.6.3)
#>  vctrs 0.3.7 2021-03-29 [1] CRAN (R 3.6.3)
#>  withr 2.4.2 2021-04-18 [1] CRAN (R 3.6.3)
#>  workflows * 0.2.3 2021-07-16 [1] CRAN (R 3.6.3)
#>  xfun 0.22 2021-03-11 [1] CRAN (R 3.6.3)
#>  xml2 1.3.2 2020-04-23 [1] CRAN (R 3.6.3)
#>  yaml 2.2.1 2020-02-01 [1] CRAN (R 3.6.2)
#>  yardstick * 0.0.8 2021-03-28 [1] CRAN (R 3.6.3)
#>
#> [1] C:/Users/robecd/AppData/Local/R/R-3.6.3/library
```

P.S. Many thanks for the truly fantastic work on tidymodels!

hfrick commented 3 years ago

I believe the tunable method for step_window() needs to be fixed in recipes to point to dials::window_size() instead of the non-existing dials::window().
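A quick way to confirm which of the two names dials actually exports is to inspect the package namespace. This is only a sketch, assuming dials is installed; the variable name `dials_exports` is introduced here for illustration.

```r
# List everything the installed dials package exports
dials_exports <- getNamespaceExports("dials")

# The name that tunable() reports for step_window() in the report above
"window" %in% dials_exports

# The name the reporter expected
"window_size" %in% dials_exports
```

If the first check is FALSE and the second is TRUE, the call_info entry is pointing at a function that cannot be resolved, which would be consistent with parameters() returning an empty collection.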

re a tunable statistic: @topepo do you know if this is a missing dials parameter? I didn't see one that would lend itself quite as obviously as window_size() for the size parameter.

topepo commented 3 years ago

I think that there are two things.

First, we'll need a dials parameter object for this. There is a fixed list of functions that can be used (recipes:::roll_funs).
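A rough sketch of what such an object could look like, built with dials::new_qual_param(). The name `rolling_statistic_sketch` and the vector of values are assumptions made for illustration; the authoritative list is whatever recipes:::roll_funs contains, and the parameter that eventually ships in dials may be named and defined differently.

```r
library(dials)

# Assumed set of rolling summary functions, for illustration only
stat_values_sketch <- c("mean", "median", "sd", "var", "sum", "prod", "min", "max")

# A qualitative (character) tuning parameter over those values
rolling_statistic_sketch <- new_qual_param(
  type   = "character",
  values = stat_values_sketch,
  label  = c(rolling_statistic_sketch = "Rolling Summary Statistic")
)

# Draw a few candidate values, as a tuning grid would
sampled <- value_sample(rolling_statistic_sketch, n = 3)
sampled
```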

Second, step_window() gets the input but doesn't like non-character values:

    if (!(statistic %in% roll_funs) | length(statistic) != 1)
      rlang::abort(
        paste0(
        "`statistic` should be one of: ",
        paste0("'", roll_funs, "'", collapse = ", ")
          )
        )

We'd have to add a !is_call(statistic) in there.
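To illustrate the idea, here is a standalone sketch of a check that lets a call such as tune() through and validates everything else. It is not the recipes implementation: `check_statistic_sketch()` and `roll_funs_sketch` are hypothetical names, the list of values is assumed, and base R's is.call() stands in for rlang's is_call().

```r
# Assumed stand-in for recipes:::roll_funs
roll_funs_sketch <- c("mean", "median", "sd", "var", "sum", "prod", "min", "max")

check_statistic_sketch <- function(statistic) {
  # A tune() placeholder is a call object, not a character vector;
  # skip validation so it can be filled in later during tuning.
  if (is.call(statistic)) {
    return(invisible(statistic))
  }
  # Otherwise require a single string from the allowed set
  if (!is.character(statistic) ||
      length(statistic) != 1 ||
      !(statistic %in% roll_funs_sketch)) {
    stop(
      "`statistic` should be one of: ",
      paste0("'", roll_funs_sketch, "'", collapse = ", "),
      call. = FALSE
    )
  }
  invisible(statistic)
}

ok_value <- check_statistic_sketch("mean")
ok_call  <- check_statistic_sketch(quote(tune()))
bad      <- tryCatch(check_statistic_sketch("mode"), error = function(e) e)
```

Without the is.call() guard, the `%in%` comparison receives a call object rather than a vector, which is what produces the "'match' requires vector arguments" error shown in the report.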

hfrick commented 3 years ago

Thanks for reporting this @camroberts! You should now be able to tune both the size and the statistic arguments of step_window() with the development versions of recipes and dials.
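For anyone checking the fix, a sketch along the lines of the original reprex, assuming versions of recipes and dials that include it. extract_parameter_set_dials() is the accessor in more recent tidymodels releases; on older installations parameters(rec) plays the same role.

```r
library(tidymodels)

set.seed(1)
data <- tibble(x = 1:1000,
               y = 3 + .1 * x + rnorm(1000))

# Mark both arguments for tuning
rec <- recipe(y ~ ., data = data) %>%
  step_window(x, size = tune(), statistic = tune())

# Both arguments should now be listed with resolvable dials functions
tunable(rec)

params <- extract_parameter_set_dials(rec)
params
```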

camroberts commented 3 years ago

Thank you!

github-actions[bot] commented 2 years ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.