mlr-org / mlr3cluster

Cluster analysis for mlr3
https://mlr3cluster.mlr-org.com
GNU Lesser General Public License v3.0

make mlr3cluster usable for pipelines: #14

Open henrifnk opened 3 years ago

henrifnk commented 3 years ago

Store the task in the prediction and make the measures rely on this task. This makes mlr3cluster usable with mlr3pipelines.

Solves #13

giuseppec commented 3 years ago

We need to talk about whether we always want a prediction object to contain the task (since tasks can be rather large objects). But maybe we can improve this PR as follows:

  1. We should first check whether the learner / pipeline changed the task/data. If the learner/pipeline does not change or preprocess the data, we do not need this fix and don't have to force the prediction object to carry the potentially large task object. I think this is something we should definitely do.
  2. If the learner / pipeline changed the task, we need access to the changed data in the prediction object to calculate a measure. I see two options here:
     a. We could store the task in the prediction object (like @henrifnk did here) and maybe include a keep.task flag to deactivate this in case the task is too large (not sure if I am happy with this keep.task suggestion).
     b. Instead of storing the task in the prediction object, we could pass the learner to the $score function and let the learner transform the task again inside the $score method. Advantage: we don't have to attach a (potentially large) task object to the prediction object. Big disadvantage: $score will take long if the pipeop does more than just scaling.
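To make the trade-off between options (a) and (b) concrete, here is a minimal, language-neutral sketch in Python. It is not mlr3 or mlr3cluster code: `Pipeline`, `Prediction`, `keep_task`, `within_ss` and `score` are hypothetical names invented for illustration, the "task" is a plain list of numbers, and the "clustering" is a toy split at the mean. The `calls` counter only exists to show how often preprocessing runs.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Data = List[float]  # toy 1-D "task": a single numeric feature


@dataclass
class Prediction:
    partition: List[int]
    task: Optional[Data] = None  # option (a): preprocessed data, if kept


@dataclass
class Pipeline:
    preprocess: Callable[[Data], Data]
    calls: int = 0  # how often preprocessing has been run

    def transform(self, data: Data) -> Data:
        self.calls += 1
        return self.preprocess(data)

    def predict(self, data: Data, keep_task: bool = True) -> Prediction:
        transformed = self.transform(data)
        # Toy "clustering": split at the mean of the transformed feature.
        mean = sum(transformed) / len(transformed)
        partition = [int(x > mean) for x in transformed]
        return Prediction(partition, transformed if keep_task else None)


def within_ss(data: Data, partition: List[int]) -> float:
    """Within-cluster sum of squares; needs the data, not just the labels."""
    total = 0.0
    for k in set(partition):
        members = [x for x, p in zip(data, partition) if p == k]
        center = sum(members) / len(members)
        total += sum((x - center) ** 2 for x in members)
    return total


def score(pred: Prediction,
          raw: Optional[Data] = None,
          pipeline: Optional[Pipeline] = None) -> float:
    if pred.task is not None:
        # Option (a): the prediction carries the preprocessed data.
        return within_ss(pred.task, pred.partition)
    if raw is None or pipeline is None:
        raise ValueError("without a stored task, raw data and pipeline are required")
    # Option (b): redo the preprocessing at scoring time.
    return within_ss(pipeline.transform(raw), pred.partition)


raw = [1.0, 2.0, 9.0, 10.0]
pipe = Pipeline(preprocess=lambda d: [x / 10 for x in d])

pred_a = pipe.predict(raw, keep_task=True)    # larger prediction object
pred_b = pipe.predict(raw, keep_task=False)   # small prediction object

score_a = score(pred_a)                          # no extra preprocessing
score_b = score(pred_b, raw=raw, pipeline=pipe)  # one extra preprocessing run
```

Both paths give the same score. Option (a) pays in memory, because the prediction holds a copy of the data. Option (b) pays in time, because every scoring call repeats the preprocessing, which is the concern raised above for pipeops that do more than scaling.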