I need a solution. error:'Some tuning parameters require finalization but there are recipe parameters that require tuning'

I am using the sample code from the README on the package's front page. When I try to tune a ranger random forest, I get the following error.


regularized_spec <- 
  linear_reg(penalty = tune(), mixture = tune()) %>% 
  set_engine("glmnet")

cart_spec <- 
  decision_tree(cost_complexity = tune(), min_n = tune()) %>% 
  set_engine("rpart") %>% 
  set_mode("regression")

rf_spec = rand_forest(mtry = tune(), trees = 50, min_n = tune()) %>% 
  set_engine("ranger") %>% 
  set_mode("regression")

chi_models <- 
  workflow_set(
    preproc = list(simple = base_recipe, 
                   filter = filter_rec, 
                   pca = pca_rec),
    models = list(glmnet = regularized_spec, 
                  cart = cart_spec, 
                  rf = rf_spec),
    cross = TRUE
  )

The output, including the error, is as follows:

i 1 of 7 tuning:     simple_glmnet
√ 1 of 7 tuning:     simple_glmnet (15.4s)
i 2 of 7 tuning:     simple_cart
√ 2 of 7 tuning:     simple_cart (17.4s)
i 3 of 7 tuning:     simple_rf
i Creating pre-processing data to finalize unknown parameter: mtry
√ 3 of 7 tuning:     simple_rf (4m 11.9s)
i 4 of 7 tuning:     filter_cart
√ 4 of 7 tuning:     filter_cart (27.3s)
i 5 of 7 tuning:     filter_rf
x 5 of 7 tuning:     filter_rf failed with: Some tuning parameters require finalization but there are recipe parameters that require tuning. Please use `parameters()` to finalize the parameter ranges.
i 6 of 7 tuning:     pca_cart
√ 6 of 7 tuning:     pca_cart (22.7s)
i 7 of 7 tuning:     pca_rf
x 7 of 7 tuning:     pca_rf failed with: Some tuning parameters require finalization but there are recipe parameters that require tuning. Please use `parameters()` to finalize the parameter ranges.

> rand_forest
function (mode = "unknown", mtry = NULL, trees = NULL, 
    min_n = NULL) 
{
    args <- list(mtry = enquo(mtry), trees = enquo(trees), min_n = enquo(min_n))
    new_model_spec("rand_forest", args = args, eng_args = NULL, 
        mode = mode, method = NULL, engine = NULL)
}

This looks like you need to finalize your filter_rec and pca_rec recipes before you can tune them in a workflow. It's hard to say anything specific because your example isn't reproducible but maybe ?parameters.recipe will already set you on the right course. If not, please provide a minimal reproducible example (a reprex). The reprex package is very helpful for that and has additional advice on how to create a good reprex at https://reprex.tidyverse.org/articles/reprex-dos-and-donts.html

How about this?

library(tidymodels)
library(workflowsets)

data(Chicago)

Chicago <- Chicago %>% slice(1:365)

base_recipe <- 
  recipe(ridership ~ ., data = Chicago) %>% 
  step_date(date) %>% 
  step_holiday(date) %>% 
  update_role(date, new_role = "id") %>% 
  step_dummy(all_nominal()) %>% 
  step_zv(all_predictors()) %>% 
  step_normalize(all_predictors())

filter_rec <- 
  base_recipe %>% 
  step_corr(all_of(stations), threshold = tune())

pca_rec <- 
  base_recipe %>% 
  step_pca(all_of(stations), num_comp = tune()) %>% 
  step_normalize(all_predictors())

regularized_spec <- 
  linear_reg(penalty = tune(), mixture = tune()) %>% 
  set_engine("glmnet")

cart_spec <- 
  decision_tree(cost_complexity = tune(), min_n = tune()) %>% 
  set_engine("rpart") %>% 
  set_mode("regression")

rf_spec = rand_forest(mtry = tune(), trees = 50, min_n = tune()) %>% 
  set_engine("ranger") %>% 
  set_mode("regression")

chi_models <- 
  workflow_set(
    preproc = list(simple = base_recipe, 
                   filter = filter_rec, 
                   pca = pca_rec),
    models = list(glmnet = regularized_spec, 
                  cart = cart_spec, 
                  rf = rf_spec),
    cross = TRUE
  )

chi_models <- 
  chi_models %>% 
  anti_join(tibble(wflow_id = c("pca_glmnet", "filter_glmnet")), 
            by = "wflow_id")

splits <- 
  sliding_period(
    Chicago,
    date,
    "day",
    lookback = 300,   
    assess_stop = 7, 
    step = 7 
  )

set.seed(123)
chi_models <- 
  chi_models %>% 
  workflow_map("tune_grid", resamples = splits, grid = 5, 
               metrics = metric_set(mae), verbose = TRUE)

autoplot(chi_models)

i 1 of 7 tuning:     simple_glmnet
√ 1 of 7 tuning:     simple_glmnet (16.6s)
i 2 of 7 tuning:     simple_cart
√ 2 of 7 tuning:     simple_cart (17.3s)
i 3 of 7 tuning:     simple_rf
i Creating pre-processing data to finalize unknown parameter: mtry
√ 3 of 7 tuning:     simple_rf (18.2s)
i 4 of 7 tuning:     filter_cart
√ 4 of 7 tuning:     filter_cart (29s)
i 5 of 7 tuning:     filter_rf
x 5 of 7 tuning:     filter_rf failed with: Some tuning parameters require finalization but there are recipe parameters that require tuning. Please use `parameters()` to finalize the parameter ranges.
i 6 of 7 tuning:     pca_cart
√ 6 of 7 tuning:     pca_cart (23.9s)
i 7 of 7 tuning:     pca_rf
x 7 of 7 tuning:     pca_rf failed with: Some tuning parameters require finalization but there are recipe parameters that require tuning. Please use `parameters()` to finalize the parameter ranges.

The problem is that mtry is based on the number of predictor columns. tune_grid() tries to figure this out and set the range for mtry.

For simple_rf, it can do this.

For the other cases, it cannot because the recipe has tuning parameters. For example, tune_grid() would need to know the number of PCA components to be able to set mtry. It can't, since that is being tuned.
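As a small illustration of the point above (a sketch added for clarity, using mtcars as stand-in data rather than anything from this issue), the dials parameter object for mtry starts with an unknown upper bound, and finalize() fills it in from the number of predictor columns it is given:

```r
library(dials)

# mtry() has an unknown upper bound until it sees the predictors.
mtry_param <- mtry()
has_unknowns(mtry_param)

# Stand-in predictor set (an assumption for illustration): mtcars without
# the outcome column, which leaves 10 predictor columns.
predictors <- mtcars[, -1]

# finalize() sets the upper bound to the number of columns supplied.
mtry_final <- finalize(mtry_param, predictors)
range_get(mtry_final)
```

When a recipe step such as step_pca(num_comp = tune()) changes the number of columns, there is no single predictor set to hand to finalize(), which is why the range has to be supplied manually.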

The error message is not great here (and we'll fix that). The solution is to create your own grid for those two workflows and pass them in using option_add(). For example:

# Get the ones that failed
chi_models_fixed <- 
  chi_models %>% 
  filter(wflow_id %in% c("filter_rf", "pca_rf"))

# Make grids by declaring parameter ranges
set.seed(1)
filter_grid <- 
  chi_models %>% 
  pull_workflow("filter_rf") %>% 
  parameters() %>% 
  # Set a range for mtry: 
  update(mtry = mtry(c(1, 20))) %>% 
  grid_latin_hypercube(size = 10)

set.seed(1)
pca_grid <- 
  chi_models %>% 
  pull_workflow("pca_rf") %>% 
  parameters() %>% 
  # Set a range for num_comp and mtry: 
  update(
    num_comp = num_comp(c(1, 10)),
    mtry = mtry(c(1, 20))
  ) %>% 
  grid_latin_hypercube(size = 10)

# Run the modified grids
chi_models_fixed <- 
  chi_models_fixed %>% 
  option_add(grid = filter_grid, id = "filter_rf") %>% 
  option_add(grid = pca_grid, id = "pca_rf") %>%
  workflow_map("tune_grid", resamples = splits, 
               metrics = metric_set(mae), verbose = TRUE)

# put them back together: 
chi_models <- 
  chi_models %>% 
  filter(!(wflow_id %in% c("filter_rf", "pca_rf"))) %>% 
  bind_rows(chi_models_fixed)
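To see what such a grid looks like without running the whole workflow set, here is a minimal standalone sketch that builds one directly from dials parameter objects. The ranges mirror the ones used above. The default min_n() range and the use of grid_random() in place of grid_latin_hypercube() are assumptions made only to keep the example short:

```r
library(dials)

set.seed(1)

# Every parameter has an explicit range, so nothing needs finalization.
pca_grid_sketch <- grid_random(
  mtry(c(1, 20)),
  min_n(),
  num_comp(c(1, 10)),
  size = 10
)

pca_grid_sketch
```

The result is a tibble with one column per parameter, which is the shape that option_add(grid = ...) expects.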

I'll transfer this to the tune repo and add more documentation.


tidymodels / tune

I need a solution. error:'Some tuning parameters require finalization but there are recipe parameters that require tuning' #387