grid of mtry values while training random forests with ranger

Hello.

I am working with a subset of the 'Ames Housing' dataset that originally has 17 features. Using the 'recipes' package, I preprocessed the original features and created dummy variables for the nominal predictors with the following code, which results in 36 features in the 'baked_train' dataset below.

blueprint <- recipe(Sale_Price ~ ., data = ames_train) %>%
  step_nzv(Street, Utilities, Pool_Area, Screen_Porch, Misc_Val) %>%
  step_impute_knn(Gr_Liv_Area) %>%
  step_integer(Overall_Qual) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_other(Neighborhood, threshold = 0.01, other = "other") %>%
  step_dummy(all_nominal_predictors(), one_hot = FALSE)

prepare <- prep(blueprint, data = ames_train)

baked_train <- bake(prepare, new_data = ames_train)

baked_test <- bake(prepare, new_data = ames_test)
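To illustrate why the column count grows after dummy encoding, here is a small base R sketch on made-up data. It is not the recipe above: `model.matrix()` is used only as a stand-in for `step_dummy(..., one_hot = FALSE)`, and the `toy` data frame and its column names are hypothetical.

```r
# Hypothetical toy data: one numeric and one nominal predictor.
toy <- data.frame(
  price = c(100, 150, 120, 180),
  area  = c(50, 80, 60, 95),
  zone  = factor(c("a", "b", "c", "a"))
)

# With an intercept in the formula, model.matrix() drops one reference
# level per factor, similar in spirit to one_hot = FALSE.
# The intercept column itself is removed with [, -1].
encoded <- model.matrix(price ~ ., data = toy)[, -1, drop = FALSE]

n_original <- ncol(toy) - 1   # predictors before encoding
n_encoded  <- ncol(encoded)   # predictors after encoding

c(original = n_original, encoded = n_encoded)
```

The same kind of comparison on the real data would be `ncol(ames_train) - 1` against `ncol(baked_train) - 1`, assuming `Sale_Price` is the only non-predictor column in both.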

Now, I am trying to train random forests with the 'ranger' package using the following code.

cv_specs <- trainControl(method = "repeatedcv", number = 5, repeats = 5)

param_grid_rf <- expand.grid(mtry = seq(1, 36, 1), splitrule = "variance", min.node.size = 2)

rf_cv <- train(blueprint, data = ames_train, method = "ranger", trControl = cv_specs, tuneGrid = param_grid_rf, metric = "RMSE")

Notice that I have set the grid of 'mtry' values based on the number of features in the 'baked_train' data. It is my understanding that 'caret' will apply the blueprint within each resample of 'ames_train' creating a baked version at each CV step.

The text Hands-On Machine Learning with R by Boehmke & Greenwell says in Section 3.8.3:

Consequently, the goal is to develop our blueprint, then within each resample iteration we want to apply prep() and bake() to our resample training and validation data. Luckily, the caret package simplifies this process. We only need to specify the blueprint and caret will automatically prepare and bake within each resample.

However, when I run the code above I get an error,

mtry can not be larger than number of variables in data. Ranger will EXIT now.

I get the same error when I specify 'tuneLength = 20' instead of the 'tuneGrid'. However, the code works fine when the grid of 'mtry' values runs from 1 to 17 (the number of features in the original training data 'ames_train').
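For reference, the grid that runs without the error looks roughly like the sketch below. The names `n_features_original` and `param_grid_rf_17` are made up for illustration, and the value 17 is the feature count of 'ames_train' reported above.

```r
# Assumption: 17 predictors, the count reported above for ames_train.
n_features_original <- 17

# Same grid as before, but with mtry capped at the original feature count.
param_grid_rf_17 <- expand.grid(
  mtry = seq_len(n_features_original),
  splitrule = "variance",
  min.node.size = 2
)

nrow(param_grid_rf_17)
```

Passing this as 'tuneGrid' to the same 'train()' call is the only change from the failing version.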

Can you please point out what I am missing here? Specifically, why do I have to specify the number of features in 'ames_train' instead of 'baked_train' when essentially 'caret' is supposed to create a baked version before fitting and evaluating the model for each resample?

Thanks.

topepo / caret

grid of mtry values while training random forests with ranger #1290