mlr-org / mlr3learners

Recommended learners for mlr3
https://mlr3learners.mlr-org.com
GNU Lesser General Public License v3.0

give glmnet the $importance slot #28

Open mb706 opened 5 years ago

mb706 commented 5 years ago

Because then it could be used in combination with FilterEmbedded in mlr3featsel to select features in order of L1 inclusion. Importance could be the (approximate) lambda value at which a feature is first included, which can easily be calculated from the model.
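A minimal sketch of that idea, assuming the coefficient path is available as a features x lambdas matrix together with the matching lambda vector. The helper name `first_inclusion_lambda` is made up for illustration and is not part of glmnet or mlr3learners; features that never enter the model get an importance of 0, which is also just an assumption.

```r
# Hypothetical helper: for each feature, return the largest lambda on the
# path at which its coefficient is nonzero, i.e. the lambda at which the
# feature is first included when the penalty is decreased.
# `beta`:   features x lambdas coefficient matrix
# `lambda`: penalty values, one per column of `beta`
first_inclusion_lambda <- function(beta, lambda) {
  beta <- as.matrix(beta)
  stopifnot(ncol(beta) == length(lambda))
  active <- beta != 0
  imp <- apply(active, 1, function(row) {
    # Assumption: features that are never selected get importance 0.
    if (any(row)) max(lambda[row]) else 0
  })
  names(imp) <- rownames(beta)
  imp
}

# Toy path with three features and four decreasing lambda values.
beta <- rbind(
  x1 = c(0, 0.5, 0.8, 1.0),
  x2 = c(0, 0,   0.2, 0.4),
  x3 = c(0, 0,   0,   0)
)
lambda <- c(1, 0.5, 0.25, 0.125)
imp <- first_inclusion_lambda(beta, lambda)
```

With a fitted single-response glmnet model `fit`, the inputs would presumably be `fit$beta` and `fit$lambda`. The value is only approximate because the path is evaluated on a discrete lambda grid.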

mb706 commented 5 years ago

Some code I used for something similar (using "old" mlr). This only gets the order in which the features are introduced; I think the approximate lambda value would be more informative.

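library(mlr)     # "old" mlr: getTaskData(), getTaskType(), ...
library(glmnet)

# Scores features by the order in which they enter the model as the
# shrinkage decreases along the L1 regularization path.
# Note: for family = "multinomial", fit$beta is a list of per-class matrices,
# so the loop below only covers regression and binary classification.
slfun <- function(task, nselect, alpha = 1, ...) {
  xy <- getTaskData(task, target.extra = TRUE)
  if (getTaskType(task) == "regr") {
    family <- "gaussian"
  } else {
    family <- if (length(levels(xy$target)) > 2) "multinomial" else "binomial"
  }
  fit <- glmnet(x = as.matrix(xy$data), y = xy$target, alpha = alpha,
    lambda.min.ratio = 1e-4, family = family)
  # Walk the path from the largest to the smallest lambda and record each
  # feature index the first time its coefficient becomes nonzero.
  captured <- integer(0)
  for (col in seq_len(ncol(fit$beta))) {
    curcols <- which(fit$beta[, col] != 0)
    newcols <- setdiff(curcols, captured)
    captured <- c(captured, newcols)
  }
  # Features that never enter the model are ranked last.
  captured <- c(captured, setdiff(seq_len(getTaskNFeats(task)), captured))
  # order() inverts the permutation, giving each feature's position in
  # `captured`; negated so that earlier inclusion means a higher score.
  res <- -order(captured)
  names(res) <- getTaskFeatureNames(task)
  res
}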

be-marc commented 4 years ago

@mb706 Like this: https://github.com/mlr-org/mlr3learners/commit/7594ad7b3f5f281d4d6cef1cfd8d0119e9c52f7a ? Do you know how we can handle multiclass tasks? We get a different beta matrix for each target class, and the positions at which the features are introduced vary between them.
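One possible convention for the multiclass case, sketched here as an assumption and not as what the linked commit does: compute the first-inclusion lambda per class and take the maximum over classes, so a feature counts as included as soon as it enters the model for any class. The helper name `multiclass_inclusion_lambda` is hypothetical, and `beta_list` stands in for the list of per-class coefficient matrices that glmnet returns as `fit$beta` for `family = "multinomial"`.

```r
# Hypothetical helper: aggregate per-class first-inclusion lambdas by taking,
# for each feature, the maximum over all classes.
# `beta_list`: list of features x lambdas matrices, one per target class
# `lambda`:    penalty values, one per column of each matrix
multiclass_inclusion_lambda <- function(beta_list, lambda) {
  per_class <- do.call(cbind, lapply(beta_list, function(beta) {
    active <- as.matrix(beta) != 0
    # Assumption: a feature never selected for a class scores 0 there.
    apply(active, 1, function(row) if (any(row)) max(lambda[row]) else 0)
  }))
  # Rows are features, columns are classes.
  imp <- apply(per_class, 1, max)
  names(imp) <- rownames(beta_list[[1]])
  imp
}

# Toy example: two classes, three features, three decreasing lambda values.
beta_list <- list(
  a = rbind(x1 = c(0, 0.5, 0.8), x2 = c(0, 0, 0.2),     x3 = c(0, 0, 0)),
  b = rbind(x1 = c(0, 0, -0.3),  x2 = c(0, -0.4, -0.6), x3 = c(0, 0, 0))
)
lambda <- c(1, 0.5, 0.25)
imp <- multiclass_inclusion_lambda(beta_list, lambda)
```

Other aggregations (mean or minimum over classes) would be equally easy to swap in. Fitting with glmnet's `type.multinomial = "grouped"` might sidestep the question, since the grouped penalty makes a feature enter for all classes together.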