giuseppec / iml

iml: interpretable machine learning R package
https://giuseppec.github.io/iml/

FeatureImp throws an error when method='ranger' is used in caret::train #17

Closed jeonghyunwoo closed 6 years ago

jeonghyunwoo commented 6 years ago

When a model is trained with caret::train using method='ranger', FeatureImp fails with the error message shown below:

library(tidyverse)
library(recipes)
library(rsample)
library(caret)
library(iml)
# data preparation ----
crd<- credit_data %>% rename_all(tolower)
spl<- initial_split(crd, prop=1/4)
tr<-training(spl)
tr<-recipe(status~.,data=tr) %>%
    step_meanimpute(all_numeric()) %>%
    step_modeimpute(all_nominal()) %>%
    step_center(all_numeric()) %>%
    step_scale(all_numeric()) %>%
    step_dummy(all_nominal(),-status) %>%
    prep(crd,retain=T) %>%
    juice()
# training----
rf1 <- train(status~., data=tr,method='rf',ntree=100,
                trControl=trainControl(method='cv',number=3))
rf2 <- train(status~., data=tr,method='ranger',num.trees=100,
                importance = 'impurity',
                trControl=trainControl(method='cv',number=3))
# Predictor ----
mod1 = Predictor$new(rf1, data=select(tr,-status), y=tr$status)
mod2 = Predictor$new(rf2, data=select(tr,-status), y=tr$status)
# FeatureImp----
imp1 = FeatureImp$new(mod1, loss='ce') # ok
imp2 = FeatureImp$new(mod2, loss='ce') # error occurs
# Error message in imp2----
Error in `[.data.frame`(out,  , obsLevels, drop = FALSE) : 
  undefined columns selected
christophM commented 6 years ago

Thanks for reporting.

I had a first look and it seems that the problem is that predict.ranger does not work with type = "prob". The iml package works with probabilities of classifiers, because most methods need some numerical output, like partial dependence plots, feature interaction and so on.

Feature importance is an exception because it would also work with factors (but currently doesn't). The same for TreeSurrogate. I have to think about how to resolve this.
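
The error message in the report can be reproduced in base R. The following is a minimal sketch, assuming the prediction step returned class labels rather than one probability column per class; the object names (`out_labels`, `out_probs`, `obs_levels`) and the numbers are hypothetical and are not taken from caret, ranger, or iml.

```r
# Hypothetical stand-in for a prediction step that returns class labels
# instead of one probability column per class.
obs_levels <- c("bad", "good")
out_labels <- data.frame(
  prediction = factor(c("good", "bad", "good"), levels = obs_levels)
)

# Selecting columns named after the class levels fails, because the
# data frame has no columns called "bad" or "good".
msg <- tryCatch(
  {
    out_labels[, obs_levels, drop = FALSE]
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# With one (made-up) probability column per class, the same selection works.
out_probs <- data.frame(bad = c(0.2, 0.7, 0.1), good = c(0.8, 0.3, 0.9))
probs <- out_probs[, obs_levels, drop = FALSE]
print(probs)
```

If this reading is right, any setup in which the model cannot return class probabilities would end in the same "undefined columns selected" failure.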

jeonghyunwoo commented 6 years ago

Still, thank you for making such a creative and useful package. Good luck to you.

christophM commented 6 years ago

This should work now with the latest version on GitHub:

devtools::install_github("christophM/iml")

Vicent-Ribas commented 8 months ago

Hi! Thank you for this very good package. I am having problems running the feature interaction function. I am also modeling with a random forest using the "ranger" package. My predictor uses type="response". It works for feature importance but not for feature interactions, where it throws the following error:

Error in h.test(f, j, no.j) :
  Assertion on 'f.all' failed: Contains missing values (element 1).
In addition: There were 50 or more warnings (use warnings() to see the first 50)
1: In mean.default(.y.hat) : argument is not numeric or logical: returning NA

Is this normal? I have made it work for a regression random forest, but now I am facing this error with a classification random forest. Thank you!
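
The warning quoted above is what base R emits when mean() is called on a non-numeric vector. The following is a minimal sketch of that behaviour, assuming the predictions reached the interaction computation as class labels; the vectors are made up for illustration and are not output from ranger or iml.

```r
# Hypothetical predictions of a classifier returned as class labels.
y_hat_labels <- factor(c("good", "bad", "good", "good"))

# mean() is not defined for factors: it returns NA and warns
# "argument is not numeric or logical: returning NA".
avg_labels <- suppressWarnings(mean(y_hat_labels))
print(avg_labels)

# Hypothetical class probabilities for the same observations are numeric,
# so averaging them is well defined.
y_hat_prob_good <- c(0.9, 0.2, 0.7, 0.6)
avg_prob <- mean(y_hat_prob_good)
print(avg_prob)
```

Under that assumption, the missing values in the assertion would come from averaging labels. Having the predictor return class probabilities instead of labels is the likely direction for a fix, which matches the earlier remark in this thread that most iml methods need numerical output.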