FeatureImp returns same importance value for every variable with mlr3 BART model #202

SamuelFrederick commented 1 year ago

I am training several ML models with the mlr3 package and using iml to compute permutation feature importance. For BART models, however, the reported importance is exactly the same for every variable. The code below reproduces the issue on a toy dataset: even x4_noise, which is unrelated to the outcome by construction, gets the same importance as the other variables.

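library(mlr3)               # as_task_regr(), lrn()
library(mlr3pipelines)      # po(), %>>%, GraphLearner
library(mlr3extralearners)  # provides lrn("regr.bart")
library(iml)

set.seed(1)  # arbitrary seed, added only so the toy data is reproducible
n <- 100
x1 <- rnorm(n, 4, 5)
x2 <- sample(c("a", "b", "c"), size = n, replace = TRUE)
x3 <- sample(letters[1:4], size = n, replace = TRUE)
x4_noise <- rnorm(n, 1, 6)  # pure noise: never enters y below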

y <- 3 + 2*x1 + 5*(x2=="a") - 10*(x2=="b") + 25*(x2=="c") +
  4*(x3=="a") - 4*(x3=="b") + 5*(x3=="c") +10*(x3=="d") - 
  50*(x3=="d")*(x2=="b") +
  rnorm(n, 0, 3)

df <- data.frame(x1 = x1, x2 = factor(x2), 
                 x3 = factor(x3), x4_noise = x4_noise, 
                 y = y)
task <- as_task_regr(df, target = "y")
gr <- po("scale") %>>% po("encode") %>>% lrn("regr.bart")
grl <- GraphLearner$new(gr)
grl$train(task)  # training step not shown in the original snippet; iml needs a trained learner to predict

model <- iml::Predictor$new(grl, data = df, y = "y")
imp_mod <- iml::FeatureImp$new(model, loss = "rmse",
                               n.repetitions = 50,
                               compare = "ratio")
imp_mod$results


   feature importance.05 importance importance.95 permutation.error
1       x1      15.61669   15.61669      15.61669          24.96426
2       x2      15.61669   15.61669      15.61669          24.96426
3       x3      15.61669   15.61669      15.61669          24.96426
4 x4_noise      15.61669   15.61669      15.61669          24.96426
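The table above can be checked for this symptom programmatically. The following is a minimal base-R sketch, not part of iml: `importance_is_degenerate` is a hypothetical helper name, and the data frame simply copies the values printed above. It flags a result in which every feature has the same importance and the 5% and 95% quantiles show no spread across repetitions.

```r
# Importance table copied from the output above.
res <- data.frame(
  feature           = c("x1", "x2", "x3", "x4_noise"),
  importance.05     = rep(15.61669, 4),
  importance        = rep(15.61669, 4),
  importance.95     = rep(15.61669, 4),
  permutation.error = rep(24.96426, 4)
)

# Hypothetical helper (not an iml function): TRUE when all features share
# the same importance and the 5%/95% quantiles coincide, within `tol`.
importance_is_degenerate <- function(results, tol = 1e-8) {
  same_across_features <- diff(range(results$importance)) < tol
  no_spread <- all(abs(results$importance.95 - results$importance.05) < tol)
  same_across_features && no_spread
}

importance_is_degenerate(res)
```

With 50 repetitions, a zero-width quantile interval suggests that permuting a feature did not change the predictions at all, which is consistent with the identical values across features.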

SamuelFrederick commented 1 year ago

One update: this seems to be an issue only when computing the importance in parallel with the BART model. I do not get identical importances for all variables with other models (e.g., ranger, xgboost, GBM) in parallel, or with BART when the importance is computed sequentially. The modified code below reproduces the issue:

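library(mlr3)               # as_task_regr(), lrn()
library(mlr3pipelines)      # po(), %>>%, GraphLearner
library(mlr3extralearners)  # provides lrn("regr.bart")
library(iml)

set.seed(1)  # arbitrary seed, added only so the toy data is reproducible
n <- 100
x1 <- rnorm(n, 4, 5)
x2 <- sample(c("a", "b", "c"), size = n, replace = TRUE)
x3 <- sample(letters[1:4], size = n, replace = TRUE)
x4_noise <- rnorm(n, 1, 6)  # pure noise: never enters y below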

y <- 3 + 2*x1 + 5*(x2=="a") - 10*(x2=="b") + 25*(x2=="c") +
  4*(x3=="a") - 4*(x3=="b") + 5*(x3=="c") +10*(x3=="d") - 
  50*(x3=="d")*(x2=="b") +
  rnorm(n, 0, 3)

df <- data.frame(x1 = x1, x2 = factor(x2), 
                 x3 = factor(x3), x4_noise = x4_noise, 
                 y = y)
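task <- as_task_regr(df, target = "y")
gr <- po("scale") %>>% po("encode") %>>% lrn("regr.bart")
grl <- GraphLearner$new(gr)
grl$train(task)  # training step not shown in the original snippet; iml needs a trained learner to predict

# Switching to a parallel backend is what triggers the identical importances
future::plan("multisession", workers = 2)
model <- iml::Predictor$new(grl, data = df, y = "y")
imp_mod <- iml::FeatureImp$new(model, loss = "rmse",
                               n.repetitions = 50,
                               compare = "ratio")
imp_mod$results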
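
To make the sequential-versus-parallel difference easier to see, the two runs can be put side by side. The following is a rough sketch, assuming an already constructed `iml::Predictor` such as `model` above; `compare_plans` is a hypothetical helper name, not part of iml or future. It only uses `future::plan()` and `iml::FeatureImp$new()` with the same arguments as in the snippet above.

```r
library(iml)
library(future)

# Hypothetical helper: compute permutation importance once under a
# sequential plan and once under a multisession plan, then return both
# importance columns in one data frame for comparison.
compare_plans <- function(predictor, n_rep = 50, workers = 2) {
  # plan() returns the previous plan, so it can be restored on exit
  old_plan <- future::plan("sequential")
  on.exit(future::plan(old_plan), add = TRUE)

  imp_seq <- iml::FeatureImp$new(predictor, loss = "rmse",
                                 n.repetitions = n_rep,
                                 compare = "ratio")$results

  future::plan("multisession", workers = workers)
  imp_par <- iml::FeatureImp$new(predictor, loss = "rmse",
                                 n.repetitions = n_rep,
                                 compare = "ratio")$results

  merge(imp_seq[, c("feature", "importance")],
        imp_par[, c("feature", "importance")],
        by = "feature",
        suffixes = c(".sequential", ".parallel"))
}
```

Calling `compare_plans(model)` with the BART predictor from the snippet above should then show whether the `importance.parallel` column collapses to a single value while `importance.sequential` varies across features.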