mlr-org / mlr3

mlr3: Machine Learning in R - next generation
https://mlr3.mlr-org.com
GNU Lesser General Public License v3.0

ResampleResult and BenchmarkResult's `$score()` behave surprisingly when passing a `predict_set` #1006

Closed by sebffischer 1 month ago

sebffischer commented 7 months ago

An example for BenchmarkResult is given below. In both cases, the `predict_sets` argument is simply not taken into account when scoring the measures. The problem is these lines:

Also, the `$aggregate()` method of both classes is missing the `predict_sets` argument.

library(mlr3)

# learner that predicts on both the test and holdout sets
learner = lrn("regr.debug")
learner$predict_sets = c("test", "holdout")

# add an extreme outlier row and assign it the "holdout" row role
task = tsk("mtcars")
row = task$data(1)
row$..row_id = 1000
row$mpg = 10000000
task$rbind(row)
task$set_row_roles(1000, "holdout")

bmr = benchmark(benchmark_grid(task, learner, rsmp("holdout")))
#> INFO  [11:11:10.706] [mlr3] Running benchmark with 1 resampling iterations
#> INFO  [11:11:10.740] [mlr3] Applying learner 'regr.debug' on task 'mtcars' (iter 1/1)
#> INFO  [11:11:10.753] [mlr3] Finished benchmark

# scoring on the holdout set should yield a huge MSE because of the
# outlier, but the reported score is the test-set MSE instead
score = bmr$score(msr("regr.mse"), predict_sets = "holdout")
(score$prediction[[1]]$truth - score$prediction[[1]]$response)^2
#> [1] 9.999962e+13
score$regr.mse
#> [1] 53.05924

Created on 2024-02-16 with reprex v2.0.2
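Note that the `prediction` column returned by `$score()` does respect `predict_sets` (the manual squared-error computation above uses the holdout rows), so until this is fixed, a possible workaround is to score that prediction object directly via `Prediction$score()` instead of relying on the measure column. A minimal sketch, assuming the reprex above has already been run:

```r
# workaround sketch: bypass the buggy measure column and score the
# prediction object (already filtered to the holdout set) directly
pred = score$prediction[[1]]
pred$score(msr("regr.mse"))  # MSE computed on the holdout rows
```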

sebffischer commented 1 month ago

Other people are also confused by this: https://github.com/mlr-org/mlr3/issues/951