mlr-org / ParamHelpers

Helpers for parameters in black-box optimization, tuning and machine learning.
https://paramhelpers.mlr-org.com

subset parameter #208

Closed rcannood closed 5 years ago

rcannood commented 5 years ago

Hello all!

I wanted to create a helper function for defining a 'subset' parameter. I had hoped to be able to define it as such:

makeSubsetParam <- function(id, values, default) {
  makeLogicalVectorParam(
    id = id, 
    len = length(values),
    default = values %in% default,
    trafo = carrier::crate(function(x) values[x], values = values)
  )
}
subset <- makeSubsetParam(id = "dimred", values = c("mds", "pca", "tsne"), default = c("pca", "mds"))

But a logical vector param does not have a trafo argument, so it makes sense that this does not work.

Instead, I hoped to circumvent this problem by calling makeParam directly. However, generating a design with trafo = TRUE results in an error:

makeSubsetParam <- function(id, values, default) {
  ParamHelpers:::makeParam(
    id = id,
    type = "logicalvector", 
    learner.param = FALSE, 
    len = length(values),
    values = values, 
    cnames = NULL, 
    default = values %in% default,
    trafo = carrier::crate(function(x) values[x], values = values),
    requires = NULL, 
    tunable = TRUE, 
    special.vals = list()
  )
}
subset <- makeSubsetParam(id = "dimred", values = c("mds", "pca", "tsne"), default = c("pca", "mds"))
parset <- makeParamSet(subset)
generateDesign(n = 10, par.set = parset, trafo = TRUE)
# Error in generateDesign(n = 10, par.set = parset, trafo = TRUE) : 
#   INTEGER() can only be applied to a 'integer', not a 'character'

I guess my issue is similar to #25. I tried to use a discretevector instead, but got the same error message:

makeSubsetParam <- function(id, values, default) {
  ParamHelpers:::makeParam(
    id = id,
    type = "discretevector", 
    learner.param = FALSE, 
    len = length(values),
    values = list("TRUE", "FALSE"), 
    cnames = NULL, 
    default = as.list(ifelse(values %in% default, "TRUE", "FALSE")),
    trafo = carrier::crate(function(x) values[unlist(x) == "TRUE"], values = values),
    requires = NULL, 
    tunable = TRUE, 
    special.vals = list()
  )
}
subset <- makeSubsetParam(id = "dimred", values = c("mds", "pca", "tsne"), default = c("pca", "mds"))
parset <- makeParamSet(subset)
generateDesign(n = 10, par.set = parset, trafo = TRUE)
# Error in generateDesign(n = 10, par.set = parset, trafo = TRUE) : 
#  INTEGER() can only be applied to a 'integer', not a 'character'

Would I be able to solve this problem using ParamHelpers 1.11? Or would I have to dig deeper into the ParamHelpers code in order to be able to find a solution?

Thanks, Robrecht

mb706 commented 5 years ago

It appears that generateDesign with trafo = TRUE only works with integer or numeric (vector) parameters. I don't know whether this is a "bug" or whether it is documented somewhere. The workaround is to call generateDesign(.., trafo = FALSE) and then use dfRowToList (or dfRowsToList) together with trafoValue.

Another problem would be that DiscreteVectorParam values are list-typed. You'd have to change default and trafo accordingly. Finally, note that you don't have a list of logical, but a list of character. You'd have to index values not by unlist(x), but by unlist(x) == "TRUE".
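To illustrate the last point, here is a minimal base-R sketch (no ParamHelpers needed) of why the comparison against "TRUE" matters. The variable names by_name and by_mask are made up for this example; values and x mirror the ones in the thread.

```r
values <- c("mds", "pca", "tsne")

# A DiscreteVectorParam value arrives as a list of character flags, not logicals.
x <- list("TRUE", "FALSE", "TRUE")

# Indexing by the flags themselves is a lookup by name. `values` has no
# names, so every element comes back as NA.
by_name <- values[unlist(x)]

# Comparing against "TRUE" first gives a logical mask, which selects the subset.
by_mask <- values[unlist(x) == "TRUE"]

print(by_name)
print(by_mask)
```

Indexing a character vector with strings never raises an error here, which is why the mistake is easy to miss.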

rcannood commented 5 years ago

Thanks for the quick response!

I fixed the problems you mentioned in the second paragraph (lists instead of vectors, unlist, and == "TRUE").

I can confirm that the following works:

library(ParamHelpers)

makeSubsetParam <- function(id, values, default) {
  ParamHelpers:::makeParam(
    id = id,
    type = "discretevector", 
    learner.param = FALSE, 
    len = length(values),
    values = list("TRUE", "FALSE"), 
    cnames = NULL, 
    default = as.list(ifelse(values %in% default, "TRUE", "FALSE")),
    trafo = carrier::crate(function(x) values[unlist(x) == "TRUE"], values = values),
    requires = NULL, 
    tunable = TRUE, 
    special.vals = list()
  )
}
subset <- makeSubsetParam(id = "dimred", values = c("mds", "pca", "tsne"), default = c("pca", "mds"))
parset <- makeParamSet(subset)
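des <- generateDesign(n = 10, par.set = parset, trafo = FALSE)
# Transform the first design row (nested call, so no magrittr pipe is needed).
# The printed subset depends on the randomly generated design.
trafoValue(parset, dfRowToList(des, parset, 1))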
# $dimred
# [1] "mds"  "tsne"

Alternatively, I could also use an integervector with lower = 0, upper = 1, default = as.integer(values %in% default) and trafo = function(x) values[x == 1L]. Which do you think is better?
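For reference, the integer encoding described here can be sketched in base R as follows. This only shows the encode/decode step, not the parameter definition; default_flags, subset_trafo and decoded are names invented for the example.

```r
values  <- c("mds", "pca", "tsne")
default <- c("pca", "mds")

# Encode the default subset as 0/1 flags, one flag per candidate value.
default_flags <- as.integer(values %in% default)

# The trafo described above: keep the values whose flag equals 1.
subset_trafo <- function(x) values[x == 1L]

decoded <- subset_trafo(default_flags)
print(default_flags)
print(decoded)
```

Note that the decoded subset follows the order of values, not the order in which default was written.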

I'm planning to use this as part of mlrMBO. I guess mlrMBO uses generateDesign(..., trafo = FALSE) and then later dfRowToList and trafoValue? So the parameter defined above should work, right?

Thanks again, Robrecht

mb706 commented 5 years ago

I would use the DiscreteVector (or possibly a LogicalVector if you want, although I don't know how much changing the "type" of a parameter in the transformation is a problem) and not IntegerVector just for conceptual reasons. Using an integer vector will not make generateDesign(.., trafo = TRUE) work (since the transformation inside generateDesign does not allow a change of type or dimension during transformation), so there is really no upside to IntegerVector.
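As a rough illustration of that constraint (this is not how generateDesign checks it internally), the hypothetical helper below tests whether a transformation returns something of the same storage type and length as its input. The subset trafo fails on both counts, while a typical numeric rescaling passes.

```r
# Hypothetical helper, not part of ParamHelpers: TRUE if `trafo` returns a
# value with the same storage type and the same length as its input.
keeps_type_and_length <- function(trafo, x) {
  y <- trafo(x)
  identical(typeof(y), typeof(x)) && length(y) == length(x)
}

values <- c("mds", "pca", "tsne")

# Subset trafo: integer flags in, character vector of varying length out.
subset_trafo <- function(x) values[x == 1L]

# Typical numeric trafo: same type and length in and out.
scale_trafo <- function(x) 2^x

subset_ok <- keeps_type_and_length(subset_trafo, c(1L, 0L, 1L))
scale_ok  <- keeps_type_and_length(scale_trafo, c(1, 2, 3))
print(c(subset = subset_ok, scale = scale_ok))
```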

None of the tuning algorithms that I am aware of use generateDesign(trafo = TRUE), so for optimisation this should be fine.

rcannood commented 5 years ago

Thanks, my problem has been solved :)