mlr-org / mlr

Machine Learning in R
https://mlr.mlr-org.com

Using more than one batchmark() call can cause an error when using reduceBatchmarkResults() #2490

Closed sycoforbidden closed 4 years ago

sycoforbidden commented 6 years ago

When one needs several tasks that should not be paired with a specific learner (i.e. not every possible learner/task combination is run), reduceBatchmarkResults() works fine, but getBMRAggrPerformances() does not:

getBMRAggrPerformances(BM, as.df = TRUE)
#>Error in bmr$results[[tid]][[lid]] : 
#>  attempt to select less than one element in get1index
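For context, this base-R error arises when `[[` is indexed with a NULL or zero-length subscript, which is presumably what happens here when a learner id is missing for a task. A minimal illustration (the list names are made up; only the indexing behaviour matters):

```r
# Mimic bmr$results[[tid]][[lid]] when learner lid was never run on task tid
results <- list(task1 = list(learnerA = 1.0))
lid <- NULL  # the learner id lookup came back empty
results[["task1"]][[lid]]
#> Error in results[["task1"]][[lid]] :
#>   attempt to select less than one element in get1index
```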

The following is my use case. I want to run a benchmark where the learners have to be adjusted so that the maximum ncomp for pls changes according to a task with reduced features.

batchtools::makeExperimentRegistry()

tmp <- LTuneParam[[k]]$pars$ncomp$upper - LTuneParam[[k]]$pars$ncomp$lower

# One batchmark() call per reduced-feature task, each with its own tuned learner
lapply(1:tmp, function(m) {
  # Change the search parameter to max number of features
  tmppar <- LTuneParam[[k]]
  tmppar$pars$ncomp$upper <- tmppar$pars$ncomp$lower + m - 1
  tmplrn <- lrn[[k]]
  tmplrn$id <- paste0(tmplrn$id, "_", tmppar$pars$ncomp$upper)

  tmplrn <- makeTuneWrapper(learner = tmplrn,
                            resampling = resdesc.inner,
                            measures = resmeasure.inner,
                            par.set = tmppar,
                            control = tc[[k]],
                            show.info = FALSE)

  batchmark(learners = tmplrn,
            tasks = tmptasklist[[m]],
            resamplings = cv5,
            measures = rmse,
            models = FALSE)
})

# The remaining tasks use the unmodified learner in a single batchmark() call
batchmark(learners = lrnlist[[i]],
          tasks = tmptasklist[(tmp + 1):length(tmptasklist)],
          resamplings = cv5,
          measures = rmse,
          models = FALSE)

batchtools::submitJobs(resources = list(pm.backend = "multicore",
                                        ncpus = detectCores() - 1))
sycoforbidden commented 6 years ago

It seems the problem lies in the reduceBatchmarkResults() function: it does not list all the learners used in the benchmark experiment, apparently only the last one. The results in BM$results do follow the correct naming convention.

EDIT: After testing it out, it looks like the BMR functions assume that the learners and tasks form a full cross product, which need not hold with batchmark(). I don't know how to solve this.

BM$learners
#> $<NA>
#>   NULL
#> 
#> $<NA>
#>   NULL
#> 
#> $<NA>
#>   NULL
#> 
#> $<NA>
#>   NULL
#> 
#> $<NA>
#>   NULL
#> 
#> $<NA>
#>   NULL
#> 
#> $<NA>
#>   NULL
#> 
#> $<NA>
#>   NULL
#> 
#> $<NA>
#>   NULL
#> 
#> $<NA>
#>   NULL
#> 
#> $<NA>
#>   NULL
#> 
#> $regr.plsr.tuned
#> Learner regr.plsr.tuned from package pls
#> Type: regr
#> Name: ; Short name: 
#>   Class: TuneWrapper
#> Properties: numerics,factors
#> Predict-Type: response
#> Hyperparameters: method=simpls
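One possible workaround (an untested sketch, not a confirmed fix: it assumes grouping the batchtools jobs by problem yields one complete learner grid per task, and that mlr's mergeBenchmarkResults() will accept the resulting partial objects) is to reduce each task's jobs separately and merge afterwards:

```r
library(mlr)
library(batchtools)

# "registry-dir" is illustrative; use your actual experiment registry
reg <- loadRegistry("registry-dir")
job.table <- getJobTable(reg = reg)

# batchmark() registers each task as a batchtools problem, so splitting
# the job ids by problem gives one group per task; within a group every
# learner that was actually run on that task is present, i.e. a full grid
id.groups <- split(job.table$job.id, job.table$problem)

partial.bmrs <- lapply(id.groups, function(ids) {
  reduceBatchmarkResults(ids = ids, keep.pred = FALSE, reg = reg)
})

# Assumption: mergeBenchmarkResults() can combine these per-task results
BM <- mergeBenchmarkResults(partial.bmrs)
getBMRAggrPerformances(BM, as.df = TRUE)
```

Whether the merged object avoids the NA learner slots shown above would need to be verified; if mergeBenchmarkResults() also requires a full learner/task cross product, the partial results would have to be analysed separately instead.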
stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.