Closed hududed closed 4 months ago
@hududed By default, lrn("regr.ranger") estimates standard errors via the infinitesimal jackknife (se.method = "infjack"). This can cause problems when predicting only a few points (e.g., a single point), because no correction can be applied in that case (see ?predict.ranger). Since OptimizerFocusSearch might evaluate points in very small batches during acquisition function optimization, this can cause errors.
Ideally, use the default random forest that mlr3mbo provides: default_rf() configures ranger to use the jackknife-after-bootstrap method for estimating standard errors.
library(mlr3mbo)
library(mlr3)
library(mlr3learners)
library(bbotk)
library(data.table)
library(tibble)
data = data.table(
Power = c(45, 14, 66, 12, 23, 40, 56, 64, 48),
Speed = c(49, 33, 30, 22, 46, 15, 20, 25, 12),
DPI = c(5, 5, 7, 3, 2, 5, 6, 3, 5),
N2gas = c(1, 0, 0, 1, 0, 1, 0, 0, 1),
Defocus = c(-0.2, 0.2, 0.1, 0.1, -0.1, 0.1, 0, 0, 0.2),
Resistance = c(5000000, 5000000, 5000000, 5000000, 5000000, 12.1, 3.7, 13.9, 4.6)
)
domain = ps(Power = p_int(lower = 10, upper = 70),
Speed = p_int(lower = 10, upper = 60),
DPI = p_int(lower = 1, upper = 7),
N2gas = p_int(lower = 0, upper = 1),
Defocus = p_dbl(lower = -0.3, upper = 0.3))
codomain = ps(Resistance = p_dbl(tags = "minimize"))
archive = Archive$new(search_space = domain, codomain = codomain)
archive$add_evals(xdt = data[, c("Power", "Speed", "DPI", "N2gas", "Defocus")], ydt = data[, c("Resistance")])
###
surrogate = srlrn(default_rf(), archive = archive)
###
acq_function = acqf("ei", surrogate = surrogate)
acq_optimizer = acqo(
opt("focus_search", n_points = 1000, maxit = 10),
terminator = trm("evals", n_evals = 11000),
acq_function = acq_function)
set.seed(42)
acq_function$surrogate$update()
acq_function$update()
candidate = acq_optimizer$optimize()
> candidate
Power Speed DPI N2gas Defocus x_domain acq_ei .already_evaluated
<int> <int> <int> <int> <num> <list> <num> <lgcl>
1: 52 21 5 0 -0.04136798 <list[5]> 165035.6 FALSE
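If you would rather keep constructing the ranger learner by hand instead of calling default_rf(), a minimal sketch of the same idea is to switch the standard error method yourself. This assumes that se.method = "jack" and keep.inbag = TRUE are accepted by your installed version of mlr3learners; the num.trees value and the mtcars task are only illustrative and not taken from this thread:

```r
library(mlr3)
library(mlr3learners)

# Assumption: "jack" selects ranger's jackknife-after-bootstrap estimator,
# and keep.inbag = TRUE is needed so ranger can compute standard errors.
learner = lrn("regr.ranger",
  predict_type = "se",
  se.method = "jack",
  keep.inbag = TRUE,
  num.trees = 100L  # illustrative value
)

# Illustrative task, not the data from this issue
task = tsk("mtcars")
learner$train(task)

# Predict a single row, i.e. the small-batch case discussed above
pred = learner$predict(task, row_ids = 1L)
print(pred$se)
```

Such a learner could then be passed to srlrn() in place of default_rf().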
In your case of a standard numeric/integer search space, you might also want to use a GP as the surrogate (default_gp()).
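As a rough sketch of what swapping in the GP could look like, only the surrogate line needs to change. The toy archive below (x, y and their values) is made up for illustration. It assumes the bbotk version used in this thread, where Archive$new() can be instantiated directly (newer releases may name this class differently), and that the packages behind default_gp() (DiceKriging and, depending on the version, rgenoud) are installed:

```r
library(mlr3mbo)
library(mlr3learners)
library(bbotk)
library(data.table)

# Hypothetical one-dimensional problem, for illustration only
domain = ps(x = p_dbl(lower = 0, upper = 1))
codomain = ps(y = p_dbl(tags = "minimize"))

archive = Archive$new(search_space = domain, codomain = codomain)
archive$add_evals(
  xdt = data.table(x = c(0.1, 0.4, 0.7, 0.9)),
  ydt = data.table(y = c(1.2, 0.3, 0.8, 1.5))
)

# Same construction as with default_rf(), but using the default GP
surrogate = srlrn(default_gp(), archive = archive)
surrogate$update()

# Predicting a single point returns the posterior mean and standard error
pred = surrogate$predict(data.table(x = 0.5))
print(pred)
```

The acquisition function and acquisition optimizer from the example above should work unchanged with this surrogate.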
Hope this helps.
Ah ok, that solved it, thanks. Are default_gp and default_rf the preferred learners? I didn't see any more than those two.
These are currently the two default learners used as surrogates, see ?default_surrogate:
For numeric-only (including integers) parameter spaces without any
dependencies a Gaussian Process is constructed via ‘default_gp()’.
For mixed numeric-categorical parameter spaces, or spaces with
conditional parameters a random forest is constructed via
‘default_rf()’.
But this also might be extended in the future!
So when I run this code:
I get this error:
I am running mlr3mbo 0.2.2.