mlr-org / bbotk

Black-box optimization framework for R.
https://bbotk.mlr-org.com
GNU Lesser General Public License v3.0

Terminator stagnation and an occasionally failing learner return an error during tuning #57

Closed · missuse closed this issue 4 years ago

missuse commented 4 years ago

Hi,

I stumbled on this by chance. I am not sure if it can be classified as a bug, but I thought you should know about it.

When combined with a learner that fails occasionally, the stagnation terminator returns the error:

Error in if (self$terminator$is_terminated(self)) { : 
  missing value where TRUE/FALSE needed

Example:

# packages required for the example
library(mlr3)
library(mlr3pipelines)
library(mlr3filters)
library(mlr3tuning)
library(paradox)

lrn_rpart <- lrn("classif.rpart")
ig <- po("filter", flt("information_gain"))

ps <- ParamSet$new(list(
  ParamDbl$new("classif.rpart.cp", lower = 0, upper = 0.05),
  ParamInt$new("information_gain.filter.nfeat", lower = 20L, upper = 60L),
  ParamFct$new("information_gain.type", levels = c("infogain",
                                                   "gainratio")) # I know gainratio does not work well with Sonar
))

glrn <- ig %>>%
  lrn_rpart

glrn <- GraphLearner$new(glrn) 

# encapsulate train/predict so learner errors are caught instead of aborting the resampling
glrn$encapsulate <- c(train = "evaluate", predict = "evaluate")

cv5 <- rsmp("cv", folds = 5)

tsk <- mlr_tasks$get("sonar")

instance <- TuningInstance$new(
  task = tsk,
  learner = glrn,
  resampling = cv5,
  measures = msr("classif.ce"),
  param_set = ps,
  terminator = term("stagnation", iters = 5, threshold = 0)
)

tuner <- TunerRandomSearch$new()
set.seed(123)
tuner$tune(instance)

After six configurations are evaluated, the error occurs. I suspect it is caused by the NaN in the performance measure:

instance$archive()
   nr batch_nr  resample_result task_id                     learner_id resampling_id iters params tune_x warnings errors classif.ce
1:  1        1 <ResampleResult>   sonar information_gain.classif.rpart            cv     5 <list> <list>        0      0  0.2648084
2:  2        2 <ResampleResult>   sonar information_gain.classif.rpart            cv     5 <list> <list>        0      0  0.2596980
3:  3        3 <ResampleResult>   sonar information_gain.classif.rpart            cv     5 <list> <list>        0      5        NaN
4:  4        4 <ResampleResult>   sonar information_gain.classif.rpart            cv     5 <list> <list>        0      0  0.2454123
5:  5        5 <ResampleResult>   sonar information_gain.classif.rpart            cv     5 <list> <list>        0      0  0.2501742
6:  6        6 <ResampleResult>   sonar information_gain.classif.rpart            cv     5 <list> <list>        0      0  0.2737515
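
For context, a small hypothetical illustration (not bbotk's actual terminator code) of why a NaN score breaks the check: any comparison involving NaN yields NA, and an if() condition that receives NA raises exactly this error.

scores <- c(0.2648, 0.2597, NaN, 0.2454, 0.2502, 0.2738)  # classif.ce column from the archive above
min(scores)                   # NaN, since min() propagates NaN
NaN < 0.3                     # NA, the comparison is neither TRUE nor FALSE
if (NaN < 0.3) message("ok")  # reproduces the failure mode of the terminator check
#> Error in if (NaN < 0.3) message("ok") : missing value where TRUE/FALSE needed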

All the best,

Milan

jakob-r commented 4 years ago

As a quick solution, you could use a fallback learner to avoid NaN/NA values in the OptimInstance$archive. See the section on fallback learners here: https://mlr3book.mlr-org.com/error-handling.html
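
For illustration, a minimal sketch of that workaround applied to the example above (assuming the fallback mechanism described in the linked chapter; the featureless learner is just one possible choice of fallback):

# keep encapsulation so errors are caught, and add a fallback learner that is
# used whenever the graph learner fails, so every resampling iteration still
# yields a valid (non-NA) performance value
glrn$encapsulate <- c(train = "evaluate", predict = "evaluate")
glrn$fallback <- lrn("classif.featureless")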

Apart from that, this is definitely a bug. In general, we should test against objectives that return NAs, or decide how to deal with them.

be-marc commented 4 years ago

We decided that objectives that return NAs throw an error.

missuse commented 4 years ago

Perhaps it would be worth considering an na.rm = TRUE option for the stagnation terminator, while the default behavior remains an error.

berndbischl commented 4 years ago

> Perhaps it would be worth considering an na.rm = TRUE option for the stagnation terminator, while the default behavior remains an error.

No, you misunderstand. bbotk from now on completely DISALLOWS NAs being returned by the evaluation. You get an error if that happens, and tuning will stop, hard. But the good news is: simply use a fallback learner in mlr3. That is mlr3's canonical answer to this problem. It ENSURES that you always generate a non-NA performance value, which allows us to keep the concepts and API simple, as here.

missuse commented 4 years ago

> Perhaps it would be worth considering an na.rm = TRUE option for the stagnation terminator, while the default behavior remains an error.

> No, you misunderstand. bbotk from now on completely DISALLOWS NAs being returned by the evaluation. You get an error if that happens, and tuning will stop, hard. But the good news is: simply use a fallback learner in mlr3. That is mlr3's canonical answer to this problem. It ENSURES that you always generate a non-NA performance value, which allows us to keep the concepts and API simple, as here.

What is the point of encapsulation then?

Anyhow, thanks for the explanation. I already started using fallback learners a while ago.