xiaochi-liu closed this 1 year ago
It turns out that not all the models can support 25 different configurations. For example, we don't tune over 25 interaction degrees 😱 for the MARS model (two is much more sensible), the bagged tree isn't tuneable at all, etc:
collect_metrics(grid_results, summarize = FALSE) %>%
group_by(wflow_id) %>%
summarise(num_models = n_distinct(.config))
#> # A tibble: 12 × 2
#> wflow_id num_models
#> <chr> <int>
#> 1 boosting 25
#> 2 CART 25
#> 3 CART_bagged 1
#> 4 Cubist 25
#> 5 full_quad_KNN 25
#> 6 full_quad_linear_reg 25
#> 7 KNN 25
#> 8 MARS 2
#> 9 neural_network 25
#> 10 RF 24
#> 11 SVM_poly 25
#> 12 SVM_radial 25
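Given the per-workflow counts above, a quick sanity check is possible (a sketch, assuming the book's resampling scheme of 10-fold cross-validation repeated 5 times, i.e. 50 resamples; the `num_models` vector below is just the column from the tibble, typed in by hand):

```r
# Per-workflow candidate counts, copied from the tibble above
num_models <- c(25, 25, 1, 25, 25, 25, 25, 2, 25, 24, 25, 25)

total_configs <- sum(num_models)   # 252 configurations, not 12 * 25 = 300
resamples     <- 5 * 10            # 10-fold CV repeated 5 times

total_configs * resamples          # 12,600 fitted models, not 300 * 50 = 15,000
```

So the shortfall relative to 15,000 comes entirely from the workflows that tune fewer than 25 configurations (bagged tree: 1, MARS: 2, RF: 24).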
You are correct about dividing by two, though. There is another related change that is also needed so I'll add that.
Thank you very much Julia!
That totally makes sense. I didn't figure out how to display the number of configurations tuned for each model. Now I understand.
This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
Each model has two metrics: `rmse` and `rsq`. Thus, for the number of models, I think it should be `num_grid_models <- nrow(collect_metrics(grid_results, summarize = FALSE)) / 2`, namely 12,600 models?

Also, I'm a little bit confused: in this example, we are using:

Why is the total number of models evaluated not $5 \times 10 \times 25 \times 12 = 15,000$?
Thank you very much for your kind guidance!