giuseppec / iml

iml: interpretable machine learning R package
https://giuseppec.github.io/iml/
Other
492 stars 87 forks source link

Bugfix w.r.t. issue #185 #189

Closed g-rho closed 2 years ago

g-rho commented 2 years ago

Hi Christoph,

the PR hopefully fixes issue #185 (which results from issue no.12 in github.com/LamaTe/mlr3shiny/). The problem ist that ordered factors have two classes (ordered and factor) which results in an error when the function get.feature.type() is called within get.grid() (file utils.R).

It would be great if you could check and merge the PR such that we can close the related issue in our package. Gero

christophM commented 2 years ago

Thanks! Could you add a test that shows that it now works?

pat-s commented 2 years ago

The underlying problem here is that when an ordered factor is passed, class() returns two values "ordered", "factor" which in turn breaks some future length comparison.

Usually ones tries to avoid hardcoding such a [1] subsets in packages. Yet in this case it will solve the issue and avoid additional code refactoring.

Here's a reprex that uses the fix from @g-rho

library(mlr3)
library(mlr3learners)
library(mlr3pipelines)
library(iml)
# using pre-defined mlr3-tasks
task <- tsk("german_credit")

# create initial graph
graph <- Graph$new()
# adding learner PipeOp
graph$add_pipeop(lrn("classif.log_reg", predict_type = "prob"))

# graph <- graph %>>% po("threshold")
# graph$param_set$values$threshold.thresholds <- 0.5
# saving the graph as a GraphLearner
learner <- as_learner(graph)

# creating split for test and training data
# using default 80/20 split
train_ids <- sample(task$row_ids, task$nrow * 0.8)
test_ids <- setdiff(task$row_ids, train_ids)
# training the model
learner$train(task, row_ids = train_ids)
# predicting on the training and test data
train_pred <- learner$predict(task, row_ids = train_ids)
test_pred <- learner$predict(task, row_ids = test_ids)

# using iml for explaining
model <- Predictor$new(learner, data = task$data(), y = task$target_names)
feat_imp <- FeatureImp$new(model, loss = "ce", compare = "ratio")

# this one failed initially but works now
feat_effect <- FeatureEffects$new(model, method = "pdp")

Created on 2022-04-25 by the reprex package (v2.0.1)

pat-s commented 2 years ago

@christophM Could you merge main? (I can't myself, no perm)

You could also enable a setting the in repo settings to display a prompt to update the branch state.

pat-s commented 2 years ago

FWIW I actually meant to merge the main branch into this one so we can see the CI checks :) But it seems they are happy in the main branch already, so all good.