simonpcouch / website

Source code for my personal website!
http://www.simonpcouch.com

Tuning workflow sets with specific grids (e.g. grid_latin_hypercube etc.) #4

Closed Bendlexane closed 7 months ago

Bendlexane commented 7 months ago

Hi!

Thanks a lot for the amazing work and documentation on tidymodels that you and your colleagues have been doing!

I'm having some difficulty using custom grids with workflow sets. How can I provide my own grid to workflow_map()?

This MRE may help:

data(diabetes_pima)
diabetes_recipe <- 
  recipe(diabetes ~  ., data = diabetes_train) #plus other steps

#models specifications
svm_linear_spec 
svm_poly_spec 
svm_rbf_spec 

#workflow set
diabetes_workflow_set <-
  workflow_set(
    preproc = list(diabetes_recipe),
    models = list(
      svm_linear = svm_linear_spec,
      svm_poly = svm_poly_spec,
      svm_rbf = svm_rbf_spec
    ),
    cross = FALSE
  )

#Latin hypercube grid over the SVM tuning parameters

svm_grid <- grid_latin_hypercube(
  svm_margin(),
  rbf_sigma(),
  scale_factor(),
  degree(),
  cost(),
  size = 100
)

fit_workflows_set <-
  workflow_map(diabetes_workflow_set,
    seed = 22, ## replicability
    fn = "tune_grid",
    control = grid_ctrl,
    resamples = cv_folds,
    metrics = metric_set(roc_auc, accuracy, bal_accuracy),
    verbose = TRUE,
    grid = svm_grid
  )
#run
Error in check_grid(grid = grid, workflow = workflow, pset = pset) :   The provided `grid` has the following parameter columns that have not been marked for tuning by `tune()`: 'rbf_sigma', 'scale_factor', 'degree'.

How can I supply my own grid to the workflow set? Thank you again.

Hope that this can be useful to other users too.

Best wishes

simonpcouch commented 7 months ago

Thanks for the issue! I don't have access to your svm_linear_spec, svm_poly_spec, or svm_rbf_spec objects, and the error seems to indicate that's where the issue arises. Can you please provide a minimal reprex (reproducible example)? A reprex will help me troubleshoot and fix your issue more quickly. 🙂

Bendlexane commented 7 months ago

Hi!

Thank you for your answer! 😁 The full script can be found here:

https://github.com/TADABWorkshop/Course_material/blob/91c02d84c1b59acbcfe0eed49b930e8252acb515/Application%20of%20machine%20learning%20in%20biological%20data%20analysis%20and%20exploration/scripts/Practical%203%20-%20Support%20Vector%20Machines.Rmd

The lines of code in question are: https://github.com/TADABWorkshop/Course_material/blob/91c02d84c1b59acbcfe0eed49b930e8252acb515/Application%20of%20machine%20learning%20in%20biological%20data%20analysis%20and%20exploration/scripts/Practical%203%20-%20Support%20Vector%20Machines.Rmd#L333-L382

The dataset is from UCI ML Pima Diabetes

Best Regards! 😇

simonpcouch commented 7 months ago

Ah, I see. Note that your svm_grid contains columns for 5 different tuning parameters, while each of your _specs only tunes 2 or 3 of them. tune_grid() rejects any grid column that isn't marked with tune() in the workflow it is tuning, which is what the error is telling you. You will want to pass only the relevant columns of that grid to each specification individually using option_add(). See the examples in the help file, specifically the line reading two_class_set %>% option_add(grid = 50, id = "none_cart").
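A minimal sketch of that advice, under stated assumptions: the three `_spec` objects below are hypothetical stand-ins (the originals are not shown in this thread), `two_class_dat` from modeldata replaces the diabetes data, and the preprocessor is given the name `rec` so the workflow ids are predictable. Each workflow gets only the grid columns matching its own `tune()` parameters.

```r
library(tidymodels)  # attaches parsnip, recipes, dials, dplyr, workflowsets, ...

# Hypothetical model specs (assumed; not the ones from the original script)
svm_linear_spec <- svm_linear(cost = tune(), margin = tune()) %>%
  set_engine("kernlab") %>%
  set_mode("classification")
svm_poly_spec <- svm_poly(cost = tune(), degree = tune(),
                          scale_factor = tune(), margin = tune()) %>%
  set_engine("kernlab") %>%
  set_mode("classification")
svm_rbf_spec <- svm_rbf(cost = tune(), rbf_sigma = tune(), margin = tune()) %>%
  set_engine("kernlab") %>%
  set_mode("classification")

# Stand-in data and recipe (assumed; replaces the diabetes data)
data(two_class_dat, package = "modeldata")
rec <- recipe(Class ~ ., data = two_class_dat)

# Naming the preprocessor gives ids of the form "rec_<model name>"
wf_set <- workflow_set(
  preproc = list(rec = rec),
  models = list(svm_linear = svm_linear_spec,
                svm_poly = svm_poly_spec,
                svm_rbf = svm_rbf_spec)
)

# One shared grid over all five parameters, as in the question.
# (Newer dials releases may steer you toward grid_space_filling() instead.)
set.seed(22)
svm_grid <- grid_latin_hypercube(
  svm_margin(), rbf_sigma(), scale_factor(), degree(), cost(),
  size = 100
)

# Attach to each workflow only the columns it actually tunes
wf_set <- wf_set %>%
  option_add(grid = dplyr::select(svm_grid, cost, margin),
             id = "rec_svm_linear") %>%
  option_add(grid = dplyr::select(svm_grid, cost, degree, scale_factor, margin),
             id = "rec_svm_poly") %>%
  option_add(grid = dplyr::select(svm_grid, cost, rbf_sigma, margin),
             id = "rec_svm_rbf")
```

With the grids stored per workflow, workflow_map() would then be called without a `grid =` argument, since (as far as I understand) options passed there are applied to every workflow in the set.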