mlr-org / mlr3

mlr3: Machine Learning in R - next generation
https://mlr3.mlr-org.com
GNU Lesser General Public License v3.0

selected_features for learners that don't support it should be the entirety of features seen in training #935

Open mb706 opened 1 year ago

mb706 commented 1 year ago

This would let us correctly query a pipeline that selects features first and passes the result to a learner. The GraphLearner could then ask the learner at the end which features it used. If that learner supports embedded feature selection (e.g. rpart), this gives the correct value, and even for learners that do not support it the result would still make sense.
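A rough sketch of the fallback I have in mind, written as a standalone helper rather than as a change to the Learner class. The name `selected_or_all()` is hypothetical and not part of mlr3, and passing the training task explicitly is an assumption made to keep the example self-contained; an actual implementation would presumably take the feature names from what the learner saw during training.

```r
library(mlr3)
library(mlr3learners)

# Hypothetical helper (not part of mlr3): return the embedded selection if the
# learner supports it, otherwise all features of the task it was trained on.
selected_or_all = function(learner, task) {
  if ("selected_features" %in% learner$properties) {
    learner$selected_features()
  } else {
    # Assumption: `task` is the same task that was used for training.
    task$feature_names
  }
}

task = tsk("spam")

# rpart has the "selected_features" property, so its own selection is returned
rpart_learner = lrn("classif.rpart")
rpart_learner$train(task)
rpart_features = selected_or_all(rpart_learner, task)

# log_reg does not, so every feature seen in training is returned
logreg_learner = lrn("classif.log_reg")
logreg_learner$train(task)
logreg_features = selected_or_all(logreg_learner, task)
```

With this, a GraphLearner could query the final learner in the same way regardless of whether it does embedded feature selection.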

Also this would solve https://github.com/mlr-org/mlr3fselect/issues/87

be-marc commented 1 week ago

library(mlr3)
library(mlr3learners)

# rpart has the "selected_features" property, so this works
learner = lrn("classif.rpart")
task = tsk("spam")

learner$train(task)
learner$selected_features()

#> [1] "charDollar"      "hp"
#> [3] "remove"          "charExclamation"
#> [5] "capitalTotal"    "free"

# log_reg does not support selected_features, so the same call fails
learner = lrn("classif.log_reg")
learner$train(task)
learner$selected_features()
#> Error: attempt to apply non-function