mlr-org / mlr3

mlr3: Machine Learning in R - next generation
https://mlr3.mlr-org.com
GNU Lesser General Public License v3.0

Task should allow for simple user-defined characteristics #1168

Closed: berndbischl closed this 2 days ago

berndbischl commented 2 months ago

Assume you run a simulation study. You then often create tasks with different properties.

One option is to allow these properties to be added to a benchmark_grid and to preserve them in the result. (Currently this results in a weird error message, which could also be improved.) A better option would be to allow something like this:

task$characteristics = list(n = 5, p = 7)

or task$settings. We could then easily "unfold" this later into the table of the BenchmarkResult (BMR).
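
For illustration only, the "unfold" idea can be sketched with plain data.table, independent of mlr3. The table `results`, its `task_id` and `characteristics` columns, and the values in them are hypothetical stand-ins, not an mlr3 data structure. The sketch only shows how a list column of named characteristics could be spread into regular columns.

```r
library(data.table)

# Hypothetical stand-in for a result table: one row per task, with the
# user-defined characteristics stored as a named list in a list column.
results = data.table(
  task_id = c("sim_a", "sim_b"),
  characteristics = list(list(n = 5, p = 7), list(n = 10, p = 3))
)

# "Unfold": bind the named lists row-wise into a table, then attach the
# resulting columns next to the remaining columns.
unfolded = cbind(
  results[, list(task_id)],
  rbindlist(results$characteristics, fill = TRUE)
)

print(unfolded)
```

With fill = TRUE, a task that lacks a given characteristic would get NA in that column instead of causing an error.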

be-marc commented 2 days ago

Done in #1211.

be-marc commented 2 days ago

This works now:

library(mlr3)

# Two copies of the same task, each tagged with its own user-defined characteristic
tsk_1 = tsk("spam")
tsk_1$characteristics = list(n = 300)
tsk_2 = tsk("spam")
tsk_2$characteristics = list(n = 200)

learner = lrn("classif.rpart")
resampling = rsmp("cv", folds = 3)

design = benchmark_grid(
  task = list(tsk_1, tsk_2),
  learner = learner,
  resampling = resampling
)

bmr = benchmark(design)
# task_characteristics = TRUE adds the characteristics as columns (here: n)
as.data.table(bmr, task_characteristics = TRUE)[, list(iteration, task, learner, n)]

#    iteration               task                             learner     n
#        <int>             <list>                              <list> <num>
# 1:         1 <TaskClassif:spam> <LearnerClassifRpart:classif.rpart>   300
# 2:         2 <TaskClassif:spam> <LearnerClassifRpart:classif.rpart>   300
# 3:         3 <TaskClassif:spam> <LearnerClassifRpart:classif.rpart>   300
# 4:         1 <TaskClassif:spam> <LearnerClassifRpart:classif.rpart>   200
# 5:         2 <TaskClassif:spam> <LearnerClassifRpart:classif.rpart>   200
# 6:         3 <TaskClassif:spam> <LearnerClassifRpart:classif.rpart>   200