Defining grids with manual tuning versus parameters returns different results

cimentadaj commented 4 years ago

I'm trying to create tuning grids through two different approaches. The first is to define a model, mark its parameters with tune(), extract the tuning parameters with parameters(), and pass the result to a grid function. The second is to define the grid manually by passing the parameter-specific functions (for example, cost()) to the grid function. To my surprise, the two approaches yield different grids even when I set the same seed before each call. Is this expected?

Below are two examples. In each, the first grid is built from parameters() on the model and the second by specifying the parameter functions (such as cost()) manually:

library(parsnip)
library(dials)
#> Loading required package: scales
library(tune)
svm_mod <-
  svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab")

set.seed(42131)
grid_latin_hypercube(parameters(svm_mod), size = 10)
#> # A tibble: 10 x 2
#>        cost rbf_sigma
#>       <dbl>     <dbl>
#>  1  0.00319  3.30e-10
#>  2  2.01     7.35e- 2
#>  3  0.185    1.86e- 8
#>  4  0.00817  1.82e- 7
#>  5  0.0673   4.54e- 3
#>  6  0.0601   8.10e- 6
#>  7  0.00169  6.81e- 1
#>  8  8.03     1.11e- 9
#>  9  0.952    9.93e- 4
#> 10 20.9      8.88e- 5

set.seed(42131)
grid_latin_hypercube(cost(), rbf_sigma(), size = 10)
#> # A tibble: 10 x 2
#>       cost rbf_sigma
#>      <dbl>     <dbl>
#>  1 0.00199  3.30e-10
#>  2 0.0951   7.35e- 2
#>  3 0.0227   1.86e- 8
#>  4 0.00349  1.82e- 7
#>  5 0.0124   4.54e- 3
#>  6 0.0116   8.10e- 6
#>  7 0.00136  6.81e- 1
#>  8 0.218    1.11e- 9
#>  9 0.0607   9.93e- 4
#> 10 0.388    8.88e- 5

reg_mod <-
  linear_reg(penalty = tune(), mixture = tune()) %>%
  set_engine("glmnet")

set.seed(42131)
grid_latin_hypercube(parameters(reg_mod), size = 10)
#> # A tibble: 10 x 2
#>     penalty mixture
#>       <dbl>   <dbl>
#>  1 1.37e- 9  0.0992
#>  2 2.18e- 3  0.892 
#>  3 1.11e- 5  0.266 
#>  4 1.10e- 8  0.360 
#>  5 1.18e- 6  0.777 
#>  6 9.18e- 7  0.516 
#>  7 3.37e-10  0.984 
#>  8 4.68e- 2  0.149 
#>  9 4.16e- 4  0.715 
#> 10 3.91e- 1  0.615

set.seed(42131)
grid_latin_hypercube(penalty(), mixture(), size = 10)
#> # A tibble: 10 x 2
#>     penalty mixture
#>       <dbl>   <dbl>
#>  1 1.37e- 9  0.0518
#>  2 2.18e- 3  0.887 
#>  3 1.11e- 5  0.227 
#>  4 1.10e- 8  0.326 
#>  5 1.18e- 6  0.766 
#>  6 9.18e- 7  0.491 
#>  7 3.37e-10  0.983 
#>  8 4.68e- 2  0.105 
#>  9 4.16e- 4  0.700 
#> 10 3.91e- 1  0.595

cimentadaj commented 4 years ago

This is happening, at least for dials:::grid_latin_hypercube, in https://github.com/tidymodels/dials/blob/611260de1729e41843fc3ea9b8baaa916bcd3270/R/space_filling.R#L202

However, I don't understand why, since calling set.seed with the same value should make sample.int return the same result every time:

set.seed(32151); sample.int(10^5, 1)
#> [1] 83640
set.seed(32151); sample.int(10^5, 1)
#> [1] 83640
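
For reference, here is a small base-R sketch of what the seed does and does not guarantee. set.seed() resets the generator state, so the same sequence of calls reproduces the same draws, but any extra draw consumed in between shifts everything after it. The variable names are only illustrative, and the sketch makes no claim about where dials actually consumes random numbers.

```r
# Same seed, same sequence of calls: the draws are reproducible.
set.seed(32151)
first <- sample.int(10^5, 1)
set.seed(32151)
second <- sample.int(10^5, 1)
identical(first, second)
#> [1] TRUE

# Same seed, but one extra draw is consumed beforehand: the generator
# state has advanced, so this value comes from later in the stream.
set.seed(32151)
invisible(runif(1))
shifted <- sample.int(10^5, 1)

# Replaying exactly the same two calls reproduces `shifted`.
set.seed(32151)
invisible(runif(1))
replayed <- sample.int(10^5, 1)
identical(shifted, replayed)
#> [1] TRUE
```

So identical seeds only imply identical results when the code paths draw random numbers in the same order and map them onto the same ranges.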

juliasilge commented 4 years ago

I'm calling this a bug because it looks like the seed isn't being propagated correctly somewhere in one of these two cases.

cimentadaj commented 4 years ago

Ok, so I've been tinkering with this to make it work. Focusing only on the linear_reg example, the only parameter that differs between the two grid_regular calls is mixture. For example:

library(parsnip)
library(dials)
#> Loading required package: scales
library(tune)

reg_mod <-
  linear_reg(penalty = tune(), mixture = tune()) %>% 
  set_engine("glmnet")

p1 <- parameters(reg_mod)
p2 <- parameters(penalty(), mixture())

set.seed(42131)
grid_regular(p1, levels = 3)
#> # A tibble: 9 x 2
#>        penalty mixture
#>          <dbl>   <dbl>
#> 1 0.0000000001   0.05 
#> 2 0.00001        0.05 
#> 3 1              0.05 
#> 4 0.0000000001   0.525
#> 5 0.00001        0.525
#> 6 1              0.525
#> 7 0.0000000001   1    
#> 8 0.00001        1    
#> 9 1              1

set.seed(42131)
grid_regular(p2, levels = 3)
#> # A tibble: 9 x 2
#>        penalty mixture
#>          <dbl>   <dbl>
#> 1 0.0000000001     0  
#> 2 0.00001          0  
#> 3 1                0  
#> 4 0.0000000001     0.5
#> 5 0.00001          0.5
#> 6 1                0.5
#> 7 0.0000000001     1  
#> 8 0.00001          1  
#> 9 1                1

Here, penalty is the same in both calls, yet mixture is not. After digging in a bit, this is because the range of mixture is forced to start at 0.05 when the engine is glmnet. Inside parameters, tunable is called on the first line:

tune:::parameters.model_spec
#> function (x, ...) 
#> {
#>     all_args <- tunable(x)
#>     tuning_param <- tune_args(x)
#>     res <- dplyr::inner_join(tuning_param %>% dplyr::select(-tunable, 
#>         -component_id), all_args, by = c("name", "source", "component")) %>% 
#>         mutate(object = purrr::map(call_info, eval_call_info))
#>     dials::parameters_constr(res$name, res$id, res$source, res$component, 
#>         res$component_id, res$object)
#> }
#> <bytecode: 0x55724d5d3898>
#> <environment: namespace:tune>

And the method tunable.linear_reg explicitly changes this range:

tune:::tunable.linear_reg
#> function (x, ...) 
#> {
#>     res <- NextMethod()
#>     if (x$engine == "glmnet") {
#>         res$call_info[res$name == "mixture"] <- list(list(pkg = "dials", 
#>             fun = "mixture", range = c(0.05, 1)))
#>     }
#>     res
#> }
#> <bytecode: 0x55724d5d6c20>
#> <environment: namespace:tune>

So the problem, at least for linear_reg, is not related to set.seed. Is there a particular reason why this is done? 0 is a completely valid mixture value to search over in a tuning grid.
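
As a quick check on this explanation, the sketch below takes the mixture column from the grid_latin_hypercube(parameters(reg_mod), size = 10) output in the first comment, converts each value back to its relative position within [0.05, 1], and places that position on [0, 1]. The helper rescale_range() is a made-up name for this illustration, not part of dials. If both calls used the same random draws and differed only in range, the result should agree with the mixture() grid up to the rounding in the printed output.

```r
# Hypothetical helper: move values from one range onto another,
# keeping their relative position within the range.
rescale_range <- function(x, from, to) {
  position <- (x - from[1]) / diff(from)
  to[1] + position * diff(to)
}

# mixture as printed for grid_latin_hypercube(parameters(reg_mod), size = 10)
mixture_model <- c(0.0992, 0.892, 0.266, 0.360, 0.777,
                   0.516, 0.984, 0.149, 0.715, 0.615)

# mixture as printed for grid_latin_hypercube(penalty(), mixture(), size = 10)
mixture_manual <- c(0.0518, 0.887, 0.227, 0.326, 0.766,
                    0.491, 0.983, 0.105, 0.700, 0.595)

mixture_rescaled <- rescale_range(mixture_model, from = c(0.05, 1), to = c(0, 1))

# Largest absolute disagreement between rescaled and printed values
max(abs(mixture_rescaled - mixture_manual))
```

The remaining disagreement is small enough to be explained by the three-significant-digit printing of the tibbles, which supports the reading that only the range differs, not the seed.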

cimentadaj commented 4 years ago

The problems are completely unrelated to the grid_* functions; they all come from parameters. For the first example, using svm_rbf, the inconsistency arises because the range of cost stored internally in parsnip for svm_rbf differs from the default range of the cost function. For example:

library(parsnip)
library(dials)
#> Loading required package: scales
library(tune)

svm_mod <-
  svm_rbf(cost = tune(), rbf_sigma = tune()) %>%
  set_mode("classification") %>%
  set_engine("kernlab")

res <- tune::tunable(svm_mod)
res[res$name == "cost", "call_info", drop = TRUE]
#> [[1]]
#> [[1]]$pkg
#> [1] "dials"
#> 
#> [[1]]$fun
#> [1] "cost"
#> 
#> [[1]]$range
#> [1] -10   5

cost()
#> Cost  (quantitative)
#> Transformer:  log-2 
#> Range (transformed scale): [-10, -1]

In parsnip, the default range is -10 to 5, while in cost() it's -10 to -1 (both on the log-2 scale). Based on this, we can make the first example work as expected:

set.seed(42131)
grid_latin_hypercube(parameters(svm_mod), size = 10)
#> # A tibble: 10 x 2
#>        cost rbf_sigma
#>       <dbl>     <dbl>
#>  1  0.00319  3.30e-10
#>  2  2.01     7.35e- 2
#>  3  0.185    1.86e- 8
#>  4  0.00817  1.82e- 7
#>  5  0.0673   4.54e- 3
#>  6  0.0601   8.10e- 6
#>  7  0.00169  6.81e- 1
#>  8  8.03     1.11e- 9
#>  9  0.952    9.93e- 4
#> 10 20.9      8.88e- 5

set.seed(42131)
grid_latin_hypercube(cost(range = c(-10, 5)), rbf_sigma(), size = 10)
#> # A tibble: 10 x 2
#>        cost rbf_sigma
#>       <dbl>     <dbl>
#>  1  0.00319  3.30e-10
#>  2  2.01     7.35e- 2
#>  3  0.185    1.86e- 8
#>  4  0.00817  1.82e- 7
#>  5  0.0673   4.54e- 3
#>  6  0.0601   8.10e- 6
#>  7  0.00169  6.81e- 1
#>  8  8.03     1.11e- 9
#>  9  0.952    9.93e- 4
#> 10 20.9      8.88e- 5

I'm not sure whether this is intended or a bug. In any case, I believe fixing it would be a breaking change in either parsnip or cost().
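
The same point can be checked arithmetically against the two original svm_rbf grids from the first comment, without calling dials at all. Because cost uses a log-2 transformer, the rescaling has to happen on log2(cost). This is only a sketch in base R with illustrative variable names; it assumes the two calls drew the same positions and differed only in the range those positions were mapped onto.

```r
# cost as printed for grid_latin_hypercube(parameters(svm_mod), size = 10),
# where the range on the log2 scale is [-10, 5]
cost_model <- c(0.00319, 2.01, 0.185, 0.00817, 0.0673,
                0.0601, 0.00169, 8.03, 0.952, 20.9)

# cost as printed for grid_latin_hypercube(cost(), rbf_sigma(), size = 10),
# where the cost() default range on the log2 scale is [-10, -1]
cost_manual <- c(0.00199, 0.0951, 0.0227, 0.00349, 0.0124,
                 0.0116, 0.00136, 0.218, 0.0607, 0.388)

# Relative position of each value within [-10, 5] on the log2 scale
position <- (log2(cost_model) - (-10)) / (5 - (-10))

# Place the same positions on [-10, -1], then back-transform
cost_rescaled <- 2^(-10 + position * ((-1) - (-10)))

# Largest relative disagreement between rescaled and printed values
max(abs(cost_rescaled / cost_manual - 1))
```

For instance, the first row's 0.00319 sits about 11% of the way along [-10, 5] on the log2 scale, and the same position on [-10, -1] back-transforms to roughly 0.00199, which is the first row of the manual grid.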

topepo commented 4 years ago

This is better documented now in the pages for the grid functions.

github-actions[bot] commented 3 years ago

This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

tidymodels / dials #109