[x] maybe we should actually rename the test task to validation (?) But the naming is still confusing, as the resampling's test set then becomes the validation set ...
[ ] some more checks that verify that the holdout and validation task are compatible with the primary task. Pay attention to the different task types (e.g. don't check for target in clustering task).
This PR makes it possible to preprocess the test rows (which can e.g. be used for early stopping by xgboost) in a graph learner. As a result, early stopping with xgboost inside a graph learner now works.
Some explanations for the changes:
The relevant lines of code that restricted how we can implement the preprocessing of test rows can be found here: https://github.com/mlr-org/mlr3pipelines/blob/044762e64e68c4aec39cd2e6b6e1f8ef45f135ca/R/PipeOpTaskPreproc.R#L211-L218. First, the private `$.train_task(task)` method modifies the 'use' rows of the task in-place (usually by cbinding, but in principle anything can happen here, and users may have overwritten this method when inheriting from `PipeOpTaskPreproc`).
After the state of the `PipeOp` is set, the predictions must somehow be made on the test rows and added to the task. We previously explored row-binding them to the task, but this was inefficient, as row-binding requires row-binding all columns, even those that were not altered by the pipeop. In a graph, this would introduce an rbind-cbind-rbind-cbind, ..., rbind-cbind backend structure, which is a) hard to flatten and b) memory-inefficient and possibly slow. The solution implemented in this Pull Request sidesteps this problem by simply adding the test task to the task itself, using the newly introduced active binding `$test_task`. The test task can be conveniently created by the user, using the newly introduced `$partition()` method.
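To illustrate the idea outside of mlr3, here is a minimal base-R sketch: the test rows are kept as a separate object next to the primary data, the preprocessing state is fitted on the primary rows only, and the fitted state is then applied to the test rows. This is not the mlr3pipelines implementation; `prcomp()` merely stands in for a preprocessing `PipeOp`, and the variable names (`train`, `test`, `train_out`, `test_out`) are made up for this example.

```r
# Illustration only: plain data.frames stand in for the task and its test task.
features = c("Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width")
test_rows = 1:10

train = iris[-test_rows, ]  # primary rows
test = iris[test_rows, ]    # test rows, stored separately instead of row-bound

# "Train" step: the state (here, the PCA rotation) is fitted on primary rows only.
pca = prcomp(train[, features])

# The transformed features replace the original ones in the primary data ...
train_out = cbind(train["Species"], pca$x)

# ... and the same fitted state is applied to the separately stored test rows.
test_out = cbind(test["Species"], predict(pca, newdata = test[, features]))

dim(train_out)      # 140 rows, 5 columns
dim(test_out)       # 10 rows, 5 columns
colnames(test_out)  # Species, PC1, PC2, PC3, PC4
```

Because the test rows live in their own object, no row-binding of untouched columns is needed at any step.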
In practice, this now looks as follows:
<sup>Created on 2024-02-16 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>
* `PipeOp`s always preprocess the test task when it is provided. However, a `GraphLearner` should only preprocess the test rows when they are needed; otherwise this is unnecessary computation (they are currently not used for the learner's `$predict()` step). To communicate this, the `uses_test_task` property was introduced.
Because the `uses_test_task` property is not fixed (its presence depends e.g. on whether the `early_stopping_set` parameter from XGBoost is set to `"test"` or `"none"`), it was necessary to add the ability to dynamically generate a learner's properties. This was done using the private method `.contingent_properties()`, which can be overwritten by learners. This method must be set in the `Learner` base class to a function returning `character(0)` (and not `NULL`), because of a bug in `R6`.
* Retired interface: We previously had the API `task$set_row_roles(1, "test")` or `task$set_row_roles(1, "holdout")`.
Because we now introduced the `$test_task` field, there would have been two ways to achieve something similar. This would have made the code messy and the interface confusing. For this reason, both the `holdout` and `test` row roles were removed.
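The dynamic property mechanism described above can be sketched with a toy `R6` class hierarchy. This is only an illustration of the pattern, not the code in this PR: `ToyLearner`, `ToyXgboost`, and the plain `early_stopping_set` field are hypothetical stand-ins (in mlr3 the value lives in the learner's parameter set); only `.contingent_properties()` and `uses_test_task` are names taken from the description.

```r
library(R6)

# Hypothetical base class: fixed properties plus dynamically computed ones.
ToyLearner = R6Class("ToyLearner",
  public = list(
    early_stopping_set = "none",  # stand-in for a hyperparameter
    initialize = function(properties = character(0)) {
      private$.properties = properties
    }
  ),
  active = list(
    # Read-only binding that merges fixed and contingent properties on access.
    properties = function(rhs) {
      if (!missing(rhs)) stop("properties is read-only")
      unique(c(private$.properties, private$.contingent_properties()))
    }
  ),
  private = list(
    .properties = NULL,
    # Default: no contingent properties; returns character(0), not NULL.
    .contingent_properties = function() character(0)
  )
)

# Hypothetical subclass: the property depends on the current configuration.
ToyXgboost = R6Class("ToyXgboost",
  inherit = ToyLearner,
  private = list(
    .contingent_properties = function() {
      if (identical(self$early_stopping_set, "test")) "uses_test_task" else character(0)
    }
  )
)

learner = ToyXgboost$new(properties = "missings")
props_before = learner$properties  # only the fixed property
learner$early_stopping_set = "test"
props_after = learner$properties   # now also includes "uses_test_task"
```

Since the properties are recomputed on every access, a consumer such as a graph learner can check for `"uses_test_task"` after the configuration has changed and decide whether preprocessing the test task is needed.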
Because removing the 'holdout' and 'test' row roles breaks some existing packages, I have already created Pull Requests in some of them:
* [x] TODO: check whether I really got all packages (only checked those that I have locally available)
The general plan to merge this feature is to:
1. Make releases for these PRs:
* `mlr3learners`: https://github.com/mlr-org/mlr3learners/pull/288 (Xgboost, only dev and paramtest are failing)
* `mlr3tuning`: https://github.com/mlr-org/mlr3tuning/pull/413 (holdout set is used)
* `mcboost`: https://github.com/mlr-org/mcboost/pull/44 (vignette uses holdout set)
* `mlr3fairness`: https://github.com/mlr-org/mlr3fairness/pull/74 (there is a bug that I did not cause)
* `mlr3pipelines`: https://github.com/mlr-org/mlr3pipelines/pull/761/files (this is needed because of the way the graphlearner sets its properties)
2. Merge this branch and make a release on CRAN
3. Implement the feature in pipelines and make a release from this branch:
* https://github.com/mlr-org/mlr3pipelines/pull/760
4. Make changes in `mlr3extralearners` and bump mlr3 dependency
5. Make a gallery post about this
Reprex for the example above:

```r
library(mlr3)
library(mlr3pipelines)

task = tsk("iris")
task
#> (150 x 5): Iris Flowers
#> * Target: Species
#> * Properties: multiclass
#> * Features (4):
#>   - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width

# move rows 1:10 out of the primary task and into its test task
task$divide(1:10, "test")
task
#> (140 x 5): Iris Flowers
#> * Target: Species
#> * Properties: multiclass
#> * Features (4):
#>   - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width
#> * Test Task: (10x5)

task$test_task
#> (10 x 5): Iris Flowers
#> * Target: Species
#> * Properties: multiclass
#> * Features (4):
#>   - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width

# training the PipeOp also preprocesses the attached test task
po_pca = po("pca")
taskout = po_pca$train(list(task))[[1L]]
taskout$test_task
#> (10 x 5): Iris Flowers
#> * Target: Species
#> * Properties: multiclass
#> * Features (4):
#>   - dbl (4): PC1, PC2, PC3, PC4
```