Closed: sycoforbidden closed this issue 4 years ago.
Could you please provide a reprex, preferably with the reprex pkg?
I don't know how to do that; is there a guide? I have also never made a package, but since reduceBatchmarkResults() has no additional dependencies, it should work on its own. Any help would be appreciated.
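For reference, a minimal sketch of generating a reprex with the reprex package; the code inside the braces is only a placeholder for whatever reproduces the issue:

# install.packages("reprex")   # if the package is not installed yet
library(reprex)

# reprex() runs the expression, captures its output, and produces a
# ready-to-paste reproducible example (copied to the clipboard when available).
reprex({
  library(mlr)
  tsk <- makeRegrTask(data = mtcars, target = "mpg")
  # ... the code that reproduces the issue goes here ...
})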
I just made the reprex; you can copy it from below. The last 3 lines are the important results.
EDIT: Making a better reproducible example.
I am asking for a reproducible example. You just pasted (a lot of) function code. The reprex package may help you.
I think that should be it. The only change from one method to the other is the environment of the learner call. I don't know how important that is; it may come from the part of the code where I swap in the new function definition (a short illustration follows the snippet):
tmpfun <- get("reduceBatchmarkResults", envir = asNamespace("mlr"))
environment(newfun) <- environment(tmpfun)
attributes(newfun) <- attributes(tmpfun)
assignInNamespace("reduceBatchmarkResults", newfun, ns = "mlr")
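As a toy illustration of why identical() can report a difference even when the contents match (a standalone example, unrelated to mlr): identical() also compares environments, so two otherwise equal closures whose environments differ are not identical.

# Toy example: same body and formals, different enclosing environments.
f <- function(x) x + 1
g <- f
environment(g) <- new.env()
identical(f, g)                              # FALSE: environments differ
identical(f, g, ignore.environment = TRUE)   # TRUE: closure environments ignored

The reprex itself follows: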
library(pls)
library(batchtools)
library(mlr)
# ------------------------------------
# Proposed new function definition (body elided here; the full definition is pasted after the reprex):
newfun <- function(ids = NULL, keep.pred = TRUE, models = TRUE,
  show.info = getMlrOption("show.info"), reg = batchtools::getDefaultRegistry()) { ... }
# ------------------------------------
# Start a new simple experiment.
tsk <- makeRegrTask(data = mtcars, target = "mpg")
lrn <- list(makeLearner("regr.plsr"),
            makeLearner("regr.nnet"))
bm <- list()
# ------------------------------------
# Original models = TRUE
makeExperimentRegistry(file.dir = "BM_TRUE", seed = 1234)
batchmark(learners = lrn,
          tasks = tsk,
          resamplings = cv3,
          measures = rmse,
          models = TRUE)
submitJobs()
bm[[1]] <- reduceBatchmarkResults(show.info = FALSE)
getBMRModels(bm[[1]])
#> $mtcars
#> $mtcars$regr.plsr
#> $mtcars$regr.plsr[[1]]
#> Model for learner.id=regr.plsr; learner.class=regr.plsr
#> Trained on: task.id = mtcars; obs = 21; features = 10
#> Hyperparameters:
#>
#> $mtcars$regr.plsr[[2]]
#> Model for learner.id=regr.plsr; learner.class=regr.plsr
#> Trained on: task.id = mtcars; obs = 21; features = 10
#> Hyperparameters:
#>
#> $mtcars$regr.plsr[[3]]
#> Model for learner.id=regr.plsr; learner.class=regr.plsr
#> Trained on: task.id = mtcars; obs = 22; features = 10
#> Hyperparameters:
#>
#>
#> $mtcars$regr.nnet
#> $mtcars$regr.nnet[[1]]
#> Model for learner.id=regr.nnet; learner.class=regr.nnet
#> Trained on: task.id = mtcars; obs = 21; features = 10
#> Hyperparameters: size=3
#>
#> $mtcars$regr.nnet[[2]]
#> Model for learner.id=regr.nnet; learner.class=regr.nnet
#> Trained on: task.id = mtcars; obs = 21; features = 10
#> Hyperparameters: size=3
#>
#> $mtcars$regr.nnet[[3]]
#> Model for learner.id=regr.nnet; learner.class=regr.nnet
#> Trained on: task.id = mtcars; obs = 22; features = 10
#> Hyperparameters: size=3
# ------------------------------------
# Original models = FALSE
makeExperimentRegistry(file.dir = "BM_FALSE", seed = 1234)
batchmark(learners = lrn,
          tasks = tsk,
          resamplings = cv3,
          measures = rmse,
          models = FALSE)
submitJobs()
bm[[2]] <- reduceBatchmarkResults(show.info = FALSE)
getBMRModels(bm[[2]])
#> $mtcars
#> $mtcars$regr.plsr
#> NULL
#>
#> $mtcars$regr.nnet
#> NULL
# ------------------------------------
# Change the function definition to the new proposed function.
tmpfun <- get("reduceBatchmarkResults", envir = asNamespace("mlr"))
environment(newfun) <- environment(tmpfun)
attributes(newfun) <- attributes(tmpfun)
assignInNamespace("reduceBatchmarkResults", newfun, ns = "mlr")
# ------------------------------------
# Test on the registry created with models = TRUE, reducing with models = TRUE and with models = FALSE.
batchtools::loadRegistry("BM_TRUE", writeable = TRUE)
bm[[3]] <- mlr::reduceBatchmarkResults(models = TRUE, show.info = FALSE)
bm[[4]] <- mlr::reduceBatchmarkResults(models = FALSE, show.info = FALSE)
removeRegistry(wait = 0)
# ------------------------------------
# Test on the registry created with models = FALSE, reducing with models = TRUE and with models = FALSE; bm[[5]] should give a warning.
batchtools::loadRegistry("BM_FALSE", writeable = TRUE)
bm[[5]] <- mlr::reduceBatchmarkResults(models = TRUE, show.info = FALSE)
#> Warning in mlr::reduceBatchmarkResults(models = TRUE, show.info = FALSE):
#> models = TRUE but there are no models in BenchmarkResult
#> Warning in mlr::reduceBatchmarkResults(models = TRUE, show.info = FALSE):
#> models = TRUE but there are no models in BenchmarkResult
bm[[6]] <- mlr::reduceBatchmarkResults(models = FALSE, show.info = FALSE)
removeRegistry(wait = 0)
# ------------------------------------
# Results 1 and 3 should be identical, but they aren't, due to a change in environment:
identical(bm[[1]], bm[[3]])
#> [1] FALSE
attributes(bm[[1]]$results$mtcars$regr.nnet[[c(8, 1, 2, 16)]])$.Environment
#> <environment: 0x562723244e40>
attributes(bm[[3]]$results$mtcars$regr.nnet[[c(8, 1, 2, 16)]])$.Environment
#> <environment: 0x5627284f3d38>
identical(bm[[1]]$measures, bm[[3]]$measures)
#> [1] TRUE
identical(bm[[1]]$learners, bm[[3]]$learners)
#> [1] TRUE
# ------------------------------------
# The rest is identical:
identical(bm[[2]], bm[[6]])
#> [1] TRUE
identical(bm[[2]], bm[[5]])
#> [1] TRUE
# The result from the sapply below should be TRUE, FALSE, TRUE, FALSE, FALSE, FALSE.
mdl <- lapply(bm, getBMRModels)
sapply(1:length(mdl), function(i) {!is.null(mdl[[c(i,1,1)]])})
#> [1] TRUE FALSE TRUE FALSE FALSE FALSE
detach("package:mlr", unload = TRUE)
detach("package:batchtools", unload = TRUE)
detach("package:pls", unload = TRUE)
# Full proposed definition of reduceBatchmarkResults() (elided as { ... } in the reprex above):
reduceBatchmarkResults <- function(ids = NULL, keep.pred = TRUE, models = TRUE,
  show.info = getMlrOption("show.info"), reg = batchtools::getDefaultRegistry()) {

  requirePackages("batchtools", why = "batchmark", default.method = "load")
  assertFlag(keep.pred)
  assertClass(reg, "ExperimentRegistry")

  if (is.null(ids))
    ids = batchtools::findDone(reg = reg)
  if (NROW(ids) != nrow(batchtools::findExperiments(reg = reg)))
    warning("Collecting results for a subset of jobs. The resulting BenchmarkResult may be misleading.")

  problem = algorithm = NULL
  tab = batchtools::getJobPars(ids, reg = reg)[, c("job.id", "problem", "algorithm")]
  setkeyv(tab, cols = c("problem", "algorithm"), physical = FALSE)

  result = namedList(tab[, unique(problem)])
  for (prob in names(result)) {
    algos = unique(tab[problem == prob], by = "algorithm")
    data = batchtools::makeJob(id = algos$job.id[1L], reg = reg)$problem$data
    result[[prob]] = namedList(algos$algorithm)

    for (algo in names(result[[prob]])) {
      res = batchtools::reduceResultsList(tab[problem == prob & algorithm == algo], reg = reg)

      # New behaviour: warn if models were requested but the registry holds none,
      # and only keep models when they are both requested and available.
      if (models && is.null(res[[1L]]$model))
        warning("models = TRUE but there are no models in BenchmarkResult")
      .models = models && !is.null(res[[1L]]$model)

      lrn = data$learner[[algo]]
      extract.this = getExtractor(lrn)
      rs = mergeResampleResult(learner.id = algo, task = data$task,
        iter.results = res, measures = data$measures, rin = data$rin,
        keep.pred = keep.pred, models = .models, show.info = show.info,
        runtime = NA, extract = extract.this)
      rs$learner = lrn
      result[[prob]][[algo]] = addClasses(rs, "ResampleResult")
    }
  }

  makeS3Obj(classes = "BenchmarkResult", results = result,
    measures = data$measures,
    learners = data$learner[as.character(tab[, unique(algorithm)])])
}
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Given a set of results in a registry created with models, you can't load the benchmark result without the models. This would only require a small change in the underlying reduceBatchmarkResults() function: adding a .models = TRUE / .models = FALSE parameter.
Since the .models parameter is given at the head of the function, the alternative would theoretically work even without changing any names. In the case where .models = TRUE but there is no model in the registry, one could perhaps throw an error?
EDIT: use .models internally, models externally.
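A minimal sketch of that renaming, with models exposed in the signature and .models used internally; only the relevant fragment of the per-algorithm loop is shown, and res and mergeResampleResult() are as in the full definition pasted above:

# Sketch only: `models` is the user-facing argument, `.models` the internal flag.
reduceBatchmarkResults <- function(ids = NULL, keep.pred = TRUE, models = TRUE,
  show.info = getMlrOption("show.info"), reg = batchtools::getDefaultRegistry()) {
  # ... inside the per-algorithm loop, once `res` has been read back from the registry:
  if (models && is.null(res[[1L]]$model))
    warning("models = TRUE but there are no models in BenchmarkResult")
  .models = models && !is.null(res[[1L]]$model)
  # `.models` is then forwarded as mergeResampleResult(..., models = .models, ...)
}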