mlr-org / mlr3pipelines

Dataflow Programming for Machine Learning in R
https://mlr3pipelines.mlr-org.com/
GNU Lesser General Public License v3.0

pipeop$predict_newdata() functionality #731

Open mermast opened 12 months ago

mermast commented 12 months ago

Is there a way to transform new data with a pipeline (or a single PipeOp) that was trained on a given task? For example:

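library(mlr3)
library(mlr3pipelines)
ames = mlr3data::ames_housing
ames1 = ames[1:2000, ]
# new data: remaining rows without the target column
ames2 = subset(ames[2001:nrow(ames), ], select = -Sale_Price)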
to_remove = c("Lot_Area_m2", "Condition_3", "Misc_Feature_2")
tsk_ames = as_task_regr(ames1, target = "Sale_Price", id = "ames")
# remove problematic features
tsk_ames$select(setdiff(tsk_ames$feature_names, to_remove))
po_encod = po("encode")
# to get the transformed data
po_encod$train(list(tsk_ames))[[1]]$data()
# this throws an error stating that only Task objects are allowed as input
po_encod$predict(ames2)
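
A possible workaround, sketched here under assumptions rather than as a confirmed mlr3pipelines feature: since `$predict()` only accepts a list of Task objects, the new data can be wrapped in a task of its own before being passed to the trained PipeOp. Because the new data has no target, the sketch adds a placeholder `Sale_Price` column filled with `NA`; this assumes that `as_task_regr()` accepts an all-`NA` numeric target. The names `tsk_train`, `tsk_new`, `newdata`, and `new_out` are illustrative only.

```r
library(mlr3)
library(mlr3pipelines)

ames = mlr3data::ames_housing
to_remove = c("Lot_Area_m2", "Condition_3", "Misc_Feature_2")
train_rows = 1:2000
new_rows = 2001:nrow(ames)

# Train the PipeOp on the first 2000 rows, as in the question.
tsk_train = as_task_regr(ames[train_rows, ], target = "Sale_Price", id = "ames")
tsk_train$select(setdiff(tsk_train$feature_names, to_remove))
po_encod = po("encode")
train_out = po_encod$train(list(tsk_train))[[1]]

# Simulate new data without a known target.
# Assumption: a placeholder target of NA is accepted by as_task_regr().
newdata = as.data.frame(ames[new_rows, ])
newdata$Sale_Price = NA_real_

# Wrap the new data in a task with the same features as the training task.
tsk_new = as_task_regr(newdata, target = "Sale_Price", id = "ames_new")
tsk_new$select(tsk_train$feature_names)

# $predict() takes a list of tasks and returns a list of tasks.
new_out = po_encod$predict(list(tsk_new))[[1]]

# Transformed features only, leaving out the placeholder target.
head(new_out$data(cols = new_out$feature_names))
```

The placeholder target exists only to satisfy task construction; it is excluded again when the transformed features are extracted in the last line.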