tidymodels / dials

Tools for creating tuning parameter values
https://dials.tidymodels.org/
Other
114 stars 27 forks source link

parameter constraints #334

Open topepo opened 4 months ago

topepo commented 4 months ago

I want to add some constraints to parameter sets to filter out some grid points based on user-defined constraints:

and so on.

My thoughts are that we could have an optional list of expressions attached to a parameter set, and these filters could be executed when grids are created.

topepo commented 2 weeks ago

I'm doing some work on this. Currently, I have an interface for including an expression attribute in the parameter set. I know that we try to avoid attributes but I do see a better way to add one.

I've also done a little work on using it with grids so you can see that it works in this example:

# pak::pak("tidymodels/dials@constraints"), ask = FALSE)
library(tidymodels)

no_constr_prm <- parameters(lambda = penalty(), mixture(), num_terms(c(1, 10)))

has_constraint(no_constr_prm)
#> [1] FALSE

constr_prm <- add_parameter_constraint(no_constr_prm, lambda < 0.01)
constr_prm
#> Collection of 3 parameters for tuning
#> 
#>  identifier      type    object
#>     mixture   mixture nparam[+]
#>      lambda   penalty nparam[+]
#>   num_terms num_terms nparam[+]
#> 
#> Parameter constraint: `lambda < 0.01`.
has_constraint(constr_prm)
#> [1] TRUE

remove_parameter_constraint(constr_prm)
#> Collection of 3 parameters for tuning
#> 
#>  identifier      type    object
#>     mixture   mixture nparam[+]
#>      lambda   penalty nparam[+]
#>   num_terms num_terms nparam[+]
update_parameter_constraint(constr_prm, log10(lambda) < -2 & mixture > 1 / 2)
#> Collection of 3 parameters for tuning
#> 
#>  identifier      type    object
#>     mixture   mixture nparam[+]
#>      lambda   penalty nparam[+]
#>   num_terms num_terms nparam[+]
#> 
#> Parameter constraint: `log10(lambda) < -2 & mixture > 1 / 2`.

is_even_exp <- quote(num_terms %% 2 == 0)
even_prm <- add_parameter_constraint(no_constr_prm, !!is_even_exp)
even_prm
#> Collection of 3 parameters for tuning
#> 
#>  identifier      type    object
#>     mixture   mixture nparam[+]
#>      lambda   penalty nparam[+]
#>   num_terms num_terms nparam[+]
#> 
#> Parameter constraint: `num_terms %% 2 == 0`.

set.seed(1)
grid_random(even_prm, size = 100) %>% count(num_terms)
#> # A tibble: 5 × 2
#>   num_terms     n
#>       <int> <int>
#> 1         2     8
#> 2         4    13
#> 3         6     7
#> 4         8    11
#> 5        10    11

Created on 2024-11-08 with reprex v2.1.1

From a design point of view, I think that the best thing to do would be to add an additional argument to new_param_grid() for the constraint so that it would be universally used. Otherwise, we'd have to change the six make_*_grid() functions.

My current would be to make a set of PRs:

  1. Add only the constraint code to parameter objects.
  2. Update new_param_grid()
  3. Mark the filter argument in grid_random() and grid_regular() as deprecated (it is not in the others).
  4. No updates are needed for tune but finetune would need to update to its new_in_neighborhood() code to produce new candidates without violating the constraints.

The finetune changes might be a moderate refactor since we need all of the parameters to test the constraints. Right now, for each parameter, it generates a lot of candidates for each predictor, picks a single combination, and then updates the current grid. While it is not slow, it could probably use a design re-evaluation.

EmilHvitfeldt commented 1 week ago

I know that we try to avoid attributes but I do see a better way to add one.

i'm assuming you mean "but i don't see a better way to add one". I feel attributes are fine in this case.

are we going to force users to create one long expression for the contrains, or allow them to pass in multiple constraints.

are there times where we (tidymodels) can apply constraints to a grid based on the information we have about a workflow? if yes i think it would be neat to do so.

EmilHvitfeldt commented 1 week ago

i know we are getting into non-deterministic stuff, but the below output of grid_random() produced 50 combinations instead of 100 as was asked.

on one hand it is straight forward to sample to get 100 that satisfy the condition. but there are many times where it isn't possible to get the required unique cases. We already see that right now.