mlr-org / mlr3tuning

Hyperparameter optimization package of the mlr3 ecosystem
https://mlr3tuning.mlr-org.com/
GNU Lesser General Public License v3.0

bug? non-linear grid definition throws error on boundary checks #272

Closed kkmann closed 3 years ago

kkmann commented 3 years ago

Hi,

I might have misunderstood something, but I am running into issues with non-linear tuning grids. I am trying to define a grid with unequal spacing, e.g. the equivalent of exp(seq(log(1), log(5), length.out = 10)). My understanding was that I need to define the parameters on the log scale and then use transforms (exp in this case).

I do run into problems with the transformed parameters not respecting the boundaries of the parameter space though. That does not seem to be right... I would expect the checks to be performed on the log scale, not the transformed scale. The problem is not learner specific which is why I felt this is the right place to address it. Below is an in-depth example using xgb and rf.
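
For reference, the kind of grid described above can be sketched in base R without any mlr3 objects: the points are equally spaced on the log scale and then mapped back with exp(). The variable names below are illustrative only and are not part of mlr3 or paradox.

```r
# Equally spaced grid on the log scale, mapped back with exp().
# Names (lower, upper, n, log_grid, grid) are illustrative, not part of mlr3.
lower <- 1
upper <- 5
n <- 10

log_grid <- seq(log(lower), log(upper), length.out = n)  # linear on the log scale
grid <- exp(log_grid)                                    # unequal spacing on the original scale

# On the original scale the spacing is geometric: consecutive ratios are constant.
ratios <- grid[-1] / grid[-n]
print(grid)
print(ratios)
```

The search space in the example that follows uses the same idea: bounds are given as log(lower) and log(upper), and the trafo applies exp().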

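library(tidyverse)
library(paradox)        # ParamSet, ParamDbl, ParamInt
library(mlr3)
library(mlr3learners)   # classif.xgboost, classif.ranger
library(mlr3pipelines)  # po(), %>>%, GraphLearner
library(mlr3tuning)     # AutoTuner, tnr(), trm()
library(mlr3measures)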

# define dummy classification task
data("mtcars", package = "datasets")
task_mtcars <- TaskClassif$new(
    id = "cars",
    backend = mtcars %>% mutate(am = factor(am)),
    target = "am"
)
task_mtcars

# define xg boost auto learner
lrn_xgb <- lrn("classif.xgboost", predict_type = "prob")
lrn_xgb$param_set$values <- list(
             booster = "gbtree",
           objective = "binary:logistic",
           subsample = 0.66,
    colsample_bytree = 0.66,
             nrounds = 10
)
lrn <- GraphLearner$new(
    po("scale") %>>%
    po("encode", method = "treatment", affect_columns = selector_type("factor")) %>>%
    po("learner", learner = lrn_xgb)
)
# tuning space, on log scale
search_space <- ParamSet$new(list(
    ParamDbl$new("classif.xgboost.eta", lower = log(.01), upper = log(.5)),
    ParamDbl$new("classif.xgboost.gamma", lower = log(.01), upper = log(100))
))
# transform back to original scale
search_space$trafo <- function(x, param_set) {
      x$classif.xgboost.eta <- exp(x$classif.xgboost.eta)
    x$classif.xgboost.gamma <- exp(x$classif.xgboost.gamma)
    return(x)
}
metric <- msr("classif.auc")
lrn_auto <- AutoTuner$new(
    lrn,
    resampling = rsmp("cv", folds = 3), # quick and dirty
    measure = metric,
    search_space = search_space,
    terminator = trm("none"),
    tuner = mlr3tuning::tnr(
        "grid_search",
        param_resolutions = c(
              classif.xgboost.eta = 2,
            classif.xgboost.gamma = 2
        ),
        batch_size = 999
    )
)

lrn_auto$train(task_mtcars)
# error:
# Error in .__Archive__add_evals(self = self, private = private, super = super, :
#   Assertion on 'xdt[, self$cols_x, with = FALSE]' failed: classif.xgboost.gamma: Element 1 is not <= 4.60517.

# tuning space, on original scale
search_space <- ParamSet$new(list(
    ParamDbl$new("classif.xgboost.eta", lower = .01, upper = .5),
    ParamDbl$new("classif.xgboost.gamma", lower = .01, upper = 100)
))
lrn_auto <- AutoTuner$new(
    lrn,
    resampling = rsmp("cv", folds = 3), # quick and dirty
    measure = metric,
    search_space = search_space,
    terminator = trm("none"),
    tuner = mlr3tuning::tnr(
        "grid_search",
        param_resolutions = c(
            classif.xgboost.eta = 2,
            classif.xgboost.gamma = 2
        ),
        batch_size = 999
    )
)
# no problem
lrn_auto$train(task_mtcars)

# let's see how a different learner does
lrn_rf <- lrn("classif.ranger", predict_type = "prob")
lrn_rf$param_set$values <- list(
    num.trees = 100
)
lrn <- GraphLearner$new(
    po("scale") %>>%
    po("encode", method = "treatment", affect_columns = selector_type("factor")) %>>%
    po("learner", learner = lrn_rf)
)
# tuning space, on sqrt scale
search_space <- ParamSet$new(list(
    ParamInt$new("classif.ranger.mtry", lower = 2, upper = 5)
))
# transform back to original scale
search_space$trafo <- function(x, param_set) {
    x$classif.ranger.mtry <- x$classif.ranger.mtry^2
    return(x)
}
lrn_auto <- AutoTuner$new(
    lrn,
    resampling = rsmp("cv", folds = 3), # quick and dirty
    measure = metric,
    search_space = search_space,
    terminator = trm("none"),
    tuner = mlr3tuning::tnr(
        "grid_search",
        param_resolutions = c(
            classif.ranger.mtry = 2
        ),
        batch_size = 999
    )
)
lrn_auto$train(task_mtcars)
# error:
# Error in ranger::ranger(dependent.variable.name = task$target_names, data = task$data(), :
#   User interrupt or internal error.

# if we drop the transform, it works again
search_space <- ParamSet$new(list(
    ParamInt$new("classif.ranger.mtry", lower = 2, upper = 5)
))
# identity transform (no-op), kept only for comparison with the case above
search_space$trafo <- function(x, param_set) {
    x$classif.ranger.mtry <- x$classif.ranger.mtry
    return(x)
}
lrn_auto <- AutoTuner$new(
    lrn,
    resampling = rsmp("cv", folds = 3), # quick and dirty
    measure = metric,
    search_space = search_space,
    terminator = trm("none"),
    tuner = mlr3tuning::tnr(
        "grid_search",
        param_resolutions = c(
            classif.ranger.mtry = 2
        ),
        batch_size = 999
    )
)
lrn_auto$train(task_mtcars)
> sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin19.5.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /usr/local/Cellar/openblas/0.3.10_1/lib/libopenblasp-r0.3.10.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] mlr3measures_0.2.0      xgboost_1.2.0.1         drake_7.12.5            future.batchtools_0.9.0
 [5] future_1.19.1           paradox_0.4.0           mlr3tuning_0.3.0.9000   mlr3pipelines_0.2.1    
 [9] mlr3learners_0.3.0      mlr3_0.7.0              data.table_1.13.0       forcats_0.5.0          
[13] stringr_1.4.0           dplyr_1.0.1             purrr_0.3.4             readr_1.3.1            
[17] tidyr_1.1.1             tibble_3.0.3            ggplot2_3.3.2           tidyverse_1.3.0        

loaded via a namespace (and not attached):
 [1] fs_1.4.2           lubridate_1.7.9    filelock_1.0.2     bbotk_0.2.1        progress_1.2.2    
 [6] httr_1.4.1         tools_4.0.2        backports_1.1.10   R6_2.4.1           DBI_1.1.0         
[11] colorspace_1.4-1   withr_2.2.0        mlr3misc_0.5.0     tidyselect_1.1.0   prettyunits_1.1.1 
[16] compiler_4.0.2     cli_2.0.2          rvest_0.3.5        lgr_0.3.4          xml2_1.3.2        
[21] scales_1.1.1       checkmate_2.0.0    rappdirs_0.3.1     digest_0.6.25      txtq_0.2.3        
[26] rmarkdown_2.3      pkgconfig_2.0.3    htmltools_0.5.0    dbplyr_1.4.4       rlang_0.4.7       
[31] readxl_1.3.1       rstudioapi_0.11    generics_0.0.2     jsonlite_1.7.0     magrittr_1.5      
[36] Matrix_1.2-18      Rcpp_1.0.5         munsell_0.5.0      fansi_0.4.1        lifecycle_0.2.0   
[41] stringi_1.4.6      yaml_2.2.1         snakecase_0.11.0   storr_1.2.1        grid_4.0.2        
[46] blob_1.2.1         parallel_4.0.2     listenv_0.8.0      crayon_1.3.4       lattice_0.20-41   
[51] haven_2.3.1        hms_0.5.3          batchtools_0.9.13  knitr_1.29         pillar_1.4.6      
[56] ranger_0.12.1      igraph_1.2.5       uuid_0.1-4         base64url_1.4      future.apply_1.6.0
[61] codetools_0.2-16   reprex_0.3.0       precrec_0.11.2     glue_1.4.1         evaluate_0.14     
[66] renv_0.11.0        modelr_0.1.8       vctrs_0.3.2        cellranger_1.1.0   gtable_0.3.0      
[71] assertthat_0.2.1   xfun_0.15          janitor_2.0.1      broom_0.7.0        globals_0.13.0    
[76] ellipsis_0.3.1     brew_1.0-6       

be-marc commented 3 years ago

The checks are performed on the right scale. However, TunerGridSearch produces values slightly higher than the upper bound.

options(digits=16)

self$upper
## > [1]  4.605170185988092
x
## > [1]  4.605170185988093

self$upper is the upper bound defined by the search space and x is the proposed value by the tuner.

@mllg @mb706 checkmate::checkNumber() fails in paradox::ParamDbl$.check(). Can we relax this check or do we have to change generate_design_grid()?

kkmann commented 3 years ago

Uha, I guess the boundaries should be respected strictly - some algorithms could be sensitive to that. I feel like generate_design_grid() should guarantee that the parameters are within the defined boundaries to avoid trouble downstream.

be-marc commented 3 years ago

If you are okay with this for your current application, you can disable the check by setting check_values = FALSE in AutoTuner. You need the latest mlr3tuning dev version for this. But we still need to fix this.

kkmann commented 3 years ago

Thanks, that'll help already. So, the problem is the qunif method not respecting the defined boundaries then?

https://github.com/mlr-org/paradox/blob/d27e51620041c552270d41ad65349ce763aba1d4/R/generate_design_grid.R#L57

Is there any special consideration needed for integer parameters? Currently, I just define them as doubles and then make sure that the transformed parameters are integers. Conceptually not ideal, but seems to work (and avoids these numerical boundary issues).

berndbischl commented 3 years ago

i will look into this, thanks for the post

> Is there any special consideration needed for integer parameters? Currently, I just define them as doubles and then make sure that the transformed parameters are integers. Conceptually not ideal, but seems to work (and avoids these numerical boundary issues).

that's the recommended approach currently, and i guess in general in many other frameworks. that should work "well (enough)" if you have a longer int range / interval.

if you only have 2 or 3 int values, you can also treat it as a categorical, so as a ParamFct. but note that the tuner will then think the values are unordered and cannot learn / exploit order.
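
The double-then-round approach discussed here can be sketched in base R. This assumes a hypothetical integer parameter mtry in [2, 10] that is searched on the sqrt scale; the helper name to_int_param is made up for illustration and is not part of paradox.

```r
# Hypothetical integer parameter searched on a continuous (sqrt) scale.
lower <- 2L
upper <- 10L

# Grid on the sqrt scale, treated as doubles.
x <- seq(sqrt(lower), sqrt(upper), length.out = 5)

# Transform back, round to integers, and clamp to the integer bounds.
# `to_int_param` is an illustrative helper, not a paradox function.
to_int_param <- function(x, lower, upper) {
  v <- as.integer(round(x^2))
  pmin(pmax(v, lower), upper)
}

mtry <- to_int_param(x, lower, upper)
print(mtry)
```

With a short integer range, several grid points can round to the same integer, which is one reason a categorical parameter may be the simpler choice when only 2 or 3 values are possible.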

be-marc commented 3 years ago

library(paradox)

search_space = ParamSet$new(list(
  ParamDbl$new("gamma", lower = log(.01), upper = log(100))
))

data = generate_design_grid(search_space, param_resolutions = c(gamma = 2))$data

search_space$check_dt(data)

## >  "gamma: Element 1 is not <= 4.60517"

be-marc commented 3 years ago

> So, the problem is the qunif method not respecting the defined boundaries then?

Yes, the calculation in qunif is x * (self$upper - self$lower) + self$lower. It fails with

1 * (log(100) - log(.01)) + log(.01) == log(100)
## > [1] FALSE
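
To put the discrepancy in perspective, the following base R sketch measures how far the rescaled value lands from the upper bound and compares that gap with the double precision machine epsilon. The function rescale mirrors the formula quoted above; its name is an assumption for illustration and not the actual paradox code.

```r
# Mirror of the rescaling formula quoted above; `rescale` is an illustrative name.
rescale <- function(p, lower, upper) p * (upper - lower) + lower

lower <- log(.01)
upper <- log(100)

x <- rescale(1, lower, upper)

# Absolute gap between the proposed value and the upper bound.
gap <- abs(x - upper)
print(format(c(upper = upper, x = x), digits = 17))
print(gap)

# Gap expressed in multiples of machine epsilon, scaled by the size of the bound.
print(gap / (abs(upper) * .Machine$double.eps))

# A tolerant comparison treats both values as equal.
print(isTRUE(all.equal(x, upper)))
```

The gap is pure rounding error from the subtraction and addition, so any strict comparison against the bound can fail even though the two values are equal for practical purposes.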

kkmann commented 3 years ago

Thanks for the swift responses 🎉. So just squeeze the values into the defined boundaries with pmax/pmin then (for very fine sampling it might affect not just the boundary values)?
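
A minimal base R sketch of that clamping idea, assuming the grid is built with the same rescaling formula as quoted above. The helper name clamp is illustrative and does not exist in paradox.

```r
# Clamp grid values into [lower, upper]; `clamp` is an illustrative helper name.
clamp <- function(x, lower, upper) pmin(pmax(x, lower), upper)

lower <- log(.01)
upper <- log(100)

# Unit-interval grid rescaled with the formula p * (upper - lower) + lower.
p <- seq(0, 1, length.out = 5)
raw <- p * (upper - lower) + lower
clamped <- clamp(raw, lower, upper)

# After clamping, every value passes a strict bounds check.
print(all(clamped >= lower & clamped <= upper))

# Clamping only moves values by the size of the rounding error.
print(max(abs(clamped - raw)))
```

Applying the clamp to the whole vector rather than only to the first and last element covers the case where interior points of a very fine grid also drift past a bound.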

berndbischl commented 3 years ago

it's the old machine accuracy bug - i teach that in lectures :( we simply have to make our test less strict

be-marc commented 3 years ago

Fixed by https://github.com/mlr-org/paradox/commit/e9dfe420a3ceaac70a4f3419caf2bca5ccb17507

berndbischl commented 3 years ago

@kkmann thx a lot for the great report. that was an important thing to fix!

berndbischl commented 3 years ago

also to explain what we did, for reference: in the ParamDbl range check, instead of checking that all param values x are in [lower, upper], we now check that they are in [lower-eps, upper+eps], where eps is basically the single precision machine epsilon, which is in the order of 1e-8
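
A tolerance-based range check of this kind could look roughly like the base R sketch below. The function name in_range and the default tol = sqrt(.Machine$double.eps), which is about 1.5e-8, are assumptions chosen to match the order of magnitude mentioned above; this is not the actual paradox implementation.

```r
# Range check with a small tolerance; `in_range` and `tol` are illustrative,
# not the actual paradox implementation.
in_range <- function(x, lower, upper, tol = sqrt(.Machine$double.eps)) {
  all(x >= lower - tol & x <= upper + tol)
}

lower <- log(.01)
upper <- log(100)
x <- 1 * (upper - lower) + lower  # may overshoot `upper` by rounding error

print(in_range(x, lower, upper))             # tolerant check accepts the value
print(in_range(upper + 1e-6, lower, upper))  # a genuine violation is still rejected
```

The tolerance is far larger than the rounding error of a single rescaling step, yet small enough that values meaningfully outside the bounds still fail the check.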