mlr-org / mlr3tuning

Hyperparameter optimization package of the mlr3 ecosystem
https://mlr3tuning.mlr-org.com/
GNU Lesser General Public License v3.0

Integer parameters when using a transformation #295

Closed py9mrg closed 3 years ago

py9mrg commented 3 years ago

Hello,

I'm not sure this is an issue as such, but it might help other users, and I wonder whether it's worth making this more automatic...

Let's say I am tuning an integer parameter such as minsplit from rpart:

search_space <- ps(
    minsplit = p_int(lower = 10, upper = 1000)
)

If I want to use a logarithmic search space then I can use a transformation such as:

search_space <- ps(
    minsplit = p_int(lower = 1, upper = 3, trafo = function(x) 10^x)
)

The issue here is that it limits the selection of lower and upper to integer values - i.e. I can only end up with the final minsplit being 10, 100, 1000 and nothing in between. If I set lower = 0.5 then I get an error.
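
To make the restriction concrete, here is a minimal base-R sketch (no paradox needed) of what the integer search space above can produce. The variable names are only for illustration.

```r
# An integer parameter with lower = 1 and upper = 3 can only take these values
x <- 1:3

# Applying the trafo from the search space above
minsplit_values <- 10^x

print(minsplit_values)
```

Every value between the powers of ten is unreachable, because x itself is restricted to whole numbers before the transformation is applied.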

One way around this is to switch p_int to p_dbl:

search_space <- ps(
    minsplit = p_dbl(lower = 0.5, upper = 3, trafo = function(x) 10^x)
)

Now I don't get an error from ps but I do get an error later on when doing the tuning (rpart throws an error about non-integer minsplit). So the final solution is then as follows:

search_space <- ps(
    minsplit = p_dbl(lower = 0.5, upper = 3, trafo = function(x) as.integer(round(10^x)))
)

This works fine, so isn't really an issue.
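
As a rough sanity check, the sketch below samples from a search space of this shape and confirms that the transformed values are integers. It assumes the paradox package is installed; the seed and the sample size of 20 are arbitrary choices for illustration.

```r
library(paradox)

# Double-valued search space on the log10 scale, rounded to an integer by the trafo
search_space <- ps(
  minsplit = p_dbl(lower = 0.5, upper = 3,
                   trafo = function(x) as.integer(round(10^x)))
)

set.seed(1)  # arbitrary seed, only for reproducibility of this sketch
design <- generate_design_random(search_space, 20)

# transpose() applies the trafo; vapply() fails if any value is not an integer
values <- vapply(design$transpose(), function(cfg) cfg$minsplit, integer(1))

print(range(values))
```

With these bounds the smallest reachable value is round(10^0.5) = 3 and the largest is 1000.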

But... it seems a little disjointed to me that we have to declare a double parameter when it's really an integer to the learner. I wonder if it might be worth having p_int automatically force an integer from the trafo argument and/or allow p_int to accept non-integer values for lower / upper? Perhaps an argument such as convert_integer = TRUE could allow non-integer values for lower / upper and automatically apply as.integer(round()) to the trafo result (as.integer on its own always truncates), or at least warn the user to make sure the trafo function returns an integer, rather than throwing an error?
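
The difference between truncating and rounding is easy to see in base R. This is just an illustration with an arbitrary example value of x:

```r
x <- 0.99
raw <- 10^x                        # roughly 9.77

truncated <- as.integer(raw)       # drops the fractional part
rounded <- as.integer(round(raw))  # rounds to the nearest whole number first

print(c(truncated = truncated, rounded = rounded))
```

Here truncation gives 9 while rounding gives 10, which is why wrapping the result in round() before as.integer() matters.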

be-marc commented 3 years ago

Sorry for the late reply. to_tune(), p_dbl(), and p_int() have the logscale argument now for tuning on a logarithmic scale. This should make it much easier.

search_space = ps(
    minsplit = p_int(lower = 1, upper = 1000,  logscale = TRUE)
)

design = paradox::generate_design_random(search_space, 4)
design$transpose()

# > [[1]]
# > [[1]]$minsplit
# > [1] 618
# > 
# > 
# > [[2]]
# > [[2]]$minsplit
# > [1] 3
# > 
# > 
# > [[3]]
# > [[3]]$minsplit
# > [1] 226
# > 
# > 
# > [[4]]
# > [[4]]$minsplit
# > [1] 327

If you have any further questions, please reopen.

py9mrg commented 3 years ago

Ah excellent, thank you.