mlr-org / mlr3pipelines

Dataflow Programming for Machine Learning in R
https://mlr3pipelines.mlr-org.com/
GNU Lesser General Public License v3.0

start working on validation support #753

Closed sebffischer closed 6 months ago

sebffischer commented 6 months ago

PipeOps to look at:

sebffischer commented 6 months ago

This does not work, for the same reason that bootstrapping does not work with mlr3pipelines. Consider a task with overlapping "use" and "test" rows, e.g. task$row_roles$use of c(1, 2) and a task$row_roles$test that shares a row with it, and suppose this task should be preprocessed by a PipeOpTaskPreproc.

We want to treat row 2 differently during train and during predict: the training preprocessing should be applied to the former and the prediction preprocessing to the latter. This does not work, because the task's data backend has only 3 rows and we would have to cbind a backend that has 4 rows.
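
The row-count mismatch can be reproduced in base R without any mlr3 objects. This is only a minimal sketch: the toy `backend` data frame, the centring step standing in for a PipeOpTaskPreproc, and the "test" ids `c(2, 3)` are all assumptions chosen for illustration, not values or code from mlr3pipelines.

```r
# Toy stand-in for a task's data backend with three rows (hypothetical data).
backend <- data.frame(row_id = 1:3, x = c(1, 2, 6))

# Overlapping row roles; the "test" ids are an assumed value for illustration.
use_rows  <- c(1, 2)
test_rows <- c(2, 3)

# "Training" path: estimate the centre on the "use" rows and apply it to them.
center <- mean(backend$x[backend$row_id %in% use_rows])
train_out <- data.frame(
  row_id = use_rows,
  path   = "train",
  x_new  = backend$x[backend$row_id %in% use_rows] - center
)

# "Prediction" path: reuse the stored centre on the "test" rows.
predict_out <- data.frame(
  row_id = test_rows,
  path   = "predict",
  x_new  = backend$x[backend$row_id %in% test_rows] - center
)

# Row 2 appears once per path, so the combined output has four rows.
out <- rbind(train_out, predict_out)
n_out <- nrow(out)
n_backend <- nrow(backend)

# Column-binding four preprocessed rows onto a three-row backend fails.
res <- tryCatch(
  cbind(backend, out),
  error = function(e) conditionMessage(e)
)
print(res)
```

Here `n_out` is 4 while `n_backend` is 3, and `res` holds the error message from the failed `cbind` rather than a data frame. Row 2 needs two different preprocessed versions, but a backend keyed by row id has room for only one.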

mb706 commented 6 months ago

https://github.com/mlr-org/mlr3pipelines/issues/706