Logarithmic spacing in grid search during tuning

gleb-roma commented 2 years ago

When tuning LASSO, I didn't find a way to specify the grid with logarithmic spacing, even though it seems natural to me. The default is equal spacing.

library(DoubleML)
library(mlr3)
library(paradox)
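library(mlr3tuning)
library(mlr3learners)  # provides the "regr.glmnet" learner
library(dplyr)         # provides %>% and arrange() used below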

# set logger to omit messages during tuning and fitting
lgr::get_logger("mlr3")$set_threshold("warn")
lgr::get_logger("bbotk")$set_threshold("warn")

set.seed(3141)
n_obs = 500
n_vars = 100
theta = rep(3, 3)
# generate matrix-like objects and use the corresponding wrapper
X = matrix(stats::rnorm(n_obs * n_vars), nrow = n_obs, ncol = n_vars)
y = X[, 1:3, drop = FALSE] %*% theta  + stats::rnorm(n_obs)
df = data.frame(y, X)

doubleml_data = double_ml_data_from_data_frame(df,
                                               y_col = "y",
                                               d_cols = c("X1"),
                                               x_cols = c("X2","X3"))

set.seed(1234)
ml_g = lrn("regr.glmnet")
ml_m = lrn("regr.glmnet")
doubleml_plr = DoubleMLPLR$new(doubleml_data, ml_g, ml_m)

par_grids = list(
  "ml_g" = ParamSet$new(list(
    ParamDbl$new("lambda", lower = 0.0001, upper = 10))),  # I WANT LOGARITHMIC SPACING HERE, eg. 1e-5, 1e-4, 1e-3, etc
  "ml_m" =  ParamSet$new(list(
    ParamDbl$new("lambda", lower = 0.05, upper = 0.1))))

tune_settings = list(terminator = trm("evals", n_evals = 100),
                     algorithm = tnr("grid_search", resolution = 11),
                     rsmp_tune = rsmp("cv", folds = 5),
                     measure = list("ml_g" = msr("regr.mse"),
                                    "ml_m" = msr("regr.mse")))

doubleml_plr$tune(param_set = par_grids, tune_settings = tune_settings)

doubleml_plr$tuning_res

# BUT THE SPACING ON THE GRID IS LINEAR
doubleml_plr$tuning_res$X1$ml_g[[1]]$tuning_result[[1]]$tuning_archive %>% arrange(lambda)
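
The difference between the two spacings can be illustrated with base R alone. This is a minimal sketch, assuming grid search places `resolution` equally spaced points between `lower` and `upper` (which matches the archive above). The names `linear_grid` and `log_grid` are only illustrative.

```r
# Linear grid: resolution = 11 equally spaced points on [1e-4, 10]
linear_grid = seq(1e-4, 10, length.out = 11)

# Desired grid: equally spaced exponents, mapped through 10^x
log_grid = 10^seq(-4, 1, length.out = 6)

print(linear_grid)
print(log_grid)
```

On the linear grid, all points except the first lie above 0.99, so small lambda values are barely explored.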

MalteKurz commented 2 years ago

The tuning is based on functionality from mlr3tuning. Therefore, https://mlr3book.mlr-org.com/optimization.html#optimization is often a good resource for such questions.

One way to achieve the requested logarithmic spacing is by applying transformations (see also https://mlr3book.mlr-org.com/optimization.html#tuning). To be more specific, you can use

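par_grids = list(
  # search over the exponent x in [-5, 1]; trafo maps it to lambda = 10^x
  "ml_g" = paradox::ps(lambda = paradox::p_dbl(-5, 1, trafo = function(x) 10^x)),
  # ml_m keeps a linear grid on [0.05, 0.1]
  "ml_m" = paradox::ps(lambda = paradox::p_dbl(lower = 0.05, upper = 0.1)))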

tune_settings = list(terminator = trm("evals", n_evals = 100),
                     algorithm = tnr("grid_search", resolution = 7),
                     rsmp_tune = rsmp("cv", folds = 5),
                     measure = list("ml_g" = msr("regr.mse"),
                                    "ml_m" = msr("regr.mse")))

doubleml_plr$tune(param_set = par_grids, tune_settings = tune_settings)

doubleml_plr$tuning_res

doubleml_plr$tuning_res$X1$ml_g[[1]]$tuning_result[[1]]$tuning_archive %>% arrange(lambda)

This way you get the requested logarithmic spacing for the lambda parameter values. Note that in doubleml_plr$tuning_res$X1$ml_g[[1]]$tuning_result[[1]]$tuning_archive and doubleml_plr$tuning_res$X1$ml_g[[1]]$tuning_result[[1]]$tuning_result, the lambda entries are the pre-transformation values (the exponents passed to trafo = function(x) 10^x). In contrast, the x_domain / learner_param_vals columns contain the transformed values, i.e. the actual parameter values used by the learner. DoubleML simply passes through the tuning result and archive from mlr3tuning, where the tables are filled this way.
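
The relationship between the pre- and post-transformation values can also be checked directly with paradox, independent of DoubleML. The following is a minimal sketch, assuming a paradox version that provides `ps()`, `p_dbl()` and `generate_design_grid()`. The object names `search_space`, `design`, `exponents` and `lambdas` are only illustrative.

```r
library(paradox)

# Search space on the exponent scale; trafo maps each exponent to lambda = 10^x
search_space = ps(lambda = p_dbl(lower = -5, upper = 1, trafo = function(x) 10^x))

# Grid over the search space with 7 points, as with resolution = 7 above
design = generate_design_grid(search_space, resolution = 7)

# Pre-transformation values (what the lambda column of the archive holds)
exponents = design$data$lambda

# Post-transformation values (what is passed on to the learner, cf. x_domain)
lambdas = sapply(design$transpose(), function(x) x$lambda)

print(data.frame(exponent = exponents, lambda = lambdas))
```

With resolution = 7 on [-5, 1], the exponents are the integers -5 to 1, so the lambda values run from 1e-5 to 10 in factors of ten.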

gleb-roma commented 2 years ago

This is great, thank you!

DoubleML / doubleml-for-r

Logarithmic spacing in grid search during tuning #123