tidymodels / extratests

Integration and other testing for tidymodels

new failures related to space-filling grid default in dev tune #217

Closed simonpcouch closed 3 months ago

simonpcouch commented 3 months ago

We're seeing a couple of new failures here related to the new space-filling grids. Most of them are just snapshot changes reflecting new hyperparameter values, but one is a real issue.

The new space-filling grids propose values right at the edges of the range for numeric parameters. For the ranger engine's regularization.factor hyperparameter, this surfaced a new failure: the grid proposes a regularization.factor of 0, which ranger rejects with an error, so metrics are missing for one grid point:

library(tidymodels)

rf <- 
  parsnip::rand_forest(min_n = tune(), trees = 12) %>%
  parsnip::set_engine(
    "ranger",
    regularization.factor = tune(),
    regularization.usedepth = tune()
  ) %>%
  set_mode("regression")

set.seed(2893)
tune_grid(
  rf,
  mpg ~ ., 
  bootstraps(mtcars, times = 5), 
  grid = 4
) %>%
  collect_metrics()
#> → A | error:   The regularization coefficients cannot be smaller than 0.
#> There were issues with some computations   A: x5
#> # A tibble: 6 × 9
#>   min_n regularization.factor regularization.usedepth .metric .estimator    mean
#>   <int>                 <dbl> <lgl>                   <chr>   <chr>        <dbl>
#> 1    14                 0.667 FALSE                   rmse    standard     3.47 
#> 2    14                 0.667 FALSE                   rsq     standard     0.804
#> 3    27                 1     TRUE                    rmse    standard     4.01 
#> 4    27                 1     TRUE                    rsq     standard     0.761
#> 5    40                 0.333 FALSE                   rmse    standard     6.77 
#> 6    40                 0.333 FALSE                   rsq     standard   NaN    
#> # ℹ 3 more variables: n <int>, std_err <dbl>, .config <chr>

rf %>% extract_parameter_set_dials() %>% grid_space_filling()
#> # A tibble: 5 × 3
#>   min_n regularization.factor regularization.usedepth
#>   <int>                 <dbl> <lgl>                  
#> 1     2                  0.75 FALSE                  
#> 2    11                  0.5  TRUE                   
#> 3    21                  0    FALSE                  
#> 4    30                  1    TRUE                   
#> 5    40                  0.25 TRUE
rf %>% extract_parameter_set_dials() %>% grid_latin_hypercube()
#> Warning: `grid_latin_hypercube()` was deprecated in dials 1.3.0.
#> ℹ Please use `grid_space_filling()` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
#> # A tibble: 3 × 3
#>   min_n regularization.factor regularization.usedepth
#>   <int>                 <dbl> <lgl>                  
#> 1    16                 0.442 TRUE                   
#> 2    36                 0.784 TRUE                   
#> 3    11                 0.183 FALSE

Created on 2024-08-05 with reprex v2.1.1
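
To illustrate why the old default rarely hit this, here is a minimal base R sketch in one dimension. It is not the algorithm dials uses; it only assumes an evenly spaced design that includes both endpoints (matching the 0, 0.25, ..., 1 values above) and a textbook Latin hypercube that draws one uniform value inside each stratum. The names `even_design` and `lhs_design` are made up for the example.

```r
set.seed(1)
n <- 5

# Evenly spaced design over [0, 1]: both endpoints are always included,
# so a lower bound of 0 is always proposed.
even_design <- seq(0, 1, length.out = n)

# Textbook 1-D Latin hypercube (illustration only, not the dials code):
# one uniform draw inside each of n equal-width strata, in shuffled order.
# runif() never returns exactly 0 or 1, so the points stay strictly
# inside (0, 1).
lhs_design <- (sample(n) - runif(n)) / n

even_design
range(lhs_design)
```

Under these assumptions the evenly spaced design always contains 0, while the Latin hypercube draws never land exactly on either boundary.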

Note that the error is from the ranger engine itself:

ranger::ranger(mpg ~ ., mtcars, regularization.factor = 0)
#> Error in ranger(mpg ~ ., mtcars, regularization.factor = 0) : 
#>  The regularization coefficients cannot be smaller than 0.
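
One possible workaround on the test side, sketched here under some assumptions, is to narrow the parameter range before building the grid so that 0 is never proposed. The lower bound of 0.1 is an arbitrary choice for illustration, and this is not necessarily how the linked PRs address it.

```r
library(tidymodels)

rf <-
  rand_forest(min_n = tune(), trees = 12) %>%
  set_engine(
    "ranger",
    regularization.factor = tune(),
    regularization.usedepth = tune()
  ) %>%
  set_mode("regression")

# Replace the default [0, 1] range with one that excludes 0.
# The 0.1 lower bound is an assumption, not a recommended value.
rf_params <-
  rf %>%
  extract_parameter_set_dials() %>%
  update(regularization.factor = regularization_factor(range = c(0.1, 1)))

# The space-filling grid now stays inside the narrowed range.
rf_grid <- grid_space_filling(rf_params, size = 4)
rf_grid
```

The resulting `rf_grid` could then be passed as `grid = rf_grid` in the `tune_grid()` call from the reprex above instead of `grid = 4`.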

simonpcouch commented 3 months ago

Related to https://github.com/tidymodels/tune/pull/919 and https://github.com/tidymodels/finetune/pull/117.