mlr-org / mlr3pipelines

Dataflow Programming for Machine Learning in R
https://mlr3pipelines.mlr-org.com/
GNU Lesser General Public License v3.0

Feature intersection #593

Open MislavSag opened 3 years ago

MislavSag commented 3 years ago

Hi,

It is easy to use PipeOpFeatureUnion to take the union of features. But what if I want the intersection of features instead?

For example, this code takes the union of the features selected by two filters:

library(mlr3)
library(mlr3pipelines)

# Two filter branches, each keeping its top 3 numeric features, then combined
graph = gunion(list(
    po("filter", mlr3filters::flt("jmim"), filter.nfeat = 3, param_vals = list(affect_columns = selector_type("numeric"))),
    po("filter", mlr3filters::flt("jmi"), filter.nfeat = 3, param_vals = list(affect_columns = selector_type("numeric")))
  )) %>>%
  po("featureunion", 2)
plot(graph)
graph$train(tsk("mtcars"))

But what if I want the intersection of the important features rather than the union? The logic is that features that are important according to multiple filters are "really important". In the example above, the important features would then be disp and drat, rather than disp, drat, hp, and wt.
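
To make the desired result concrete, the two filter branches can be trained without the featureunion step and their surviving feature names intersected by hand. This is only a minimal sketch, assuming the praznik package behind the jmim and jmi filters is installed; the `affect_columns` setting is left out because every mtcars feature is numeric, and `branches`, `outputs` and `common` are names chosen for illustration.

```r
library(mlr3)
library(mlr3filters)
library(mlr3pipelines)

# The two filter branches from the example, without the featureunion step.
branches = gunion(list(
  po("filter", flt("jmim"), filter.nfeat = 3),
  po("filter", flt("jmi"), filter.nfeat = 3)
))

# Training on a single task feeds a copy to each branch and
# returns one filtered task per branch.
outputs = branches$train(tsk("mtcars"))

# Features that survived both filters.
common = Reduce(intersect, lapply(outputs, function(t) t$feature_names))
common
```

This computes the intersection outside of any pipeline, so it is suited to exploring a task rather than to use inside resampling.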

mb706 commented 3 years ago

> features that are important in multiple filters are "really important"

That is an interesting idea, though I wonder whether the "correct" solution would rather be to have some kind of meta-filter object that combines multiple filters and creates some kind of consensus.
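
One simple form such a consensus could take is averaging each feature's rank across filters. The following is a minimal base R sketch, not an mlr3filters API: `mean_rank` is a hypothetical helper, and the score vectors are made-up placeholders standing in for the named `$scores` vector a filter holds after `$calculate(task)`.

```r
# Hypothetical helper: average each feature's rank across several filters.
# `score_list` is a list of named numeric vectors; a higher score means a
# more important feature. Only features scored by every filter are kept.
mean_rank = function(score_list) {
  features = Reduce(intersect, lapply(score_list, names))
  # One column of ranks per filter; rank 1 is the highest score.
  ranks = do.call(cbind, lapply(score_list, function(s) rank(-s[features])))
  sort(rowMeans(ranks))
}

# Made-up scores for illustration only, not real filter output.
toy_scores = list(
  filter_a = c(disp = 0.9, drat = 0.7, hp = 0.2),
  filter_b = c(disp = 0.8, drat = 0.1, hp = 0.5)
)
consensus = mean_rank(toy_scores)
names(consensus)[1]  # feature with the best (lowest) average rank
```

Averaging ranks rather than raw scores sidesteps the fact that different filters score on different scales; other aggregations (minimum rank, vote counts) would fit the same shape.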

mb706 commented 3 years ago

library(R6)
library(mlr3filters)
library(mlr3misc)  # for map()

FilterCombination = R6Class("FilterCombination", inherit = Filter,
  public = list(
    filters = NULL,
    initialize = function(filters) {
      # Only advertise what every wrapped filter supports
      super$initialize(id = "filtercombination", task_type = filters[[1]]$task_type,
        task_properties = Reduce(intersect, map(filters, "task_properties")),
        feature_types = Reduce(intersect, map(filters, "feature_types")),
        packages = unlist(map(filters, "packages"))
      )
      self$filters = filters
    }
  ),
  private = list(
    .calculate = function(task, nfeat) {
      # Run every wrapped filter on the task
      scores = lapply(self$filters, function(f) f$calculate(task))
      # ...
    }
  )
)

MislavSag commented 3 years ago

Maybe that's too complex for me right now. I would need to understand the whole structure of the Filter object and how inheritance from it works. I thought I could use selector_intersect somehow, but couldn't find a way. I plan to learn mlr3 more deeply in the future, but for now I can't follow everything that is happening in the code above.
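
A lighter-weight route than a new Filter class may be a custom selector passed to po("select"). The sketch below assumes, as the PipeOpSelect documentation describes, that the `selector` parameter accepts any function that takes a task and returns feature names; `selector_filter_intersect` is a hypothetical helper, not part of mlr3pipelines, and the praznik package is assumed to be installed for the jmim and jmi filters.

```r
library(mlr3)
library(mlr3filters)
library(mlr3pipelines)

# Hypothetical selector factory: returns a function(task) that keeps the
# features ranked in the top n by every one of the listed filters.
selector_filter_intersect = function(filter_ids, n = 3) {
  function(task) {
    selected = lapply(filter_ids, function(id) {
      filter = flt(id)
      filter$calculate(task)
      # $scores is a named numeric vector sorted in decreasing order
      head(names(filter$scores), n)
    })
    Reduce(intersect, selected)
  }
}

op = po("select", selector = selector_filter_intersect(c("jmim", "jmi")))

# PipeOps take a list of inputs and return a list of outputs.
out = op$train(list(tsk("mtcars")))[[1]]
out$feature_names
```

Because the selector runs when the PipeOp is trained, the filters would only see the training data if this step is placed in a larger graph with `%>>%`.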