Closed oli666 closed 5 years ago
Same issue here
Please note: localImp must be set to TRUE when building the forest; otherwise explain_forest quits with an error. I hit the same error you describe, and setting localImp = TRUE when building the forest resolved it.
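For anyone landing here, a minimal sketch of that workaround. It uses the built-in iris data as a stand-in for the credit data (an assumption made only so the snippet runs on its own); with the reprex from this issue the call would be randomForest(default ~ ., data = credit_train, localImp = TRUE). The likely reason it helps is that localImp = TRUE makes randomForest compute the permutation importance that the report's accuracy_decrease measure relies on, but treat that explanation as an assumption rather than a confirmed diagnosis.

```r
library(randomForest)
library(randomForestExplainer)

set.seed(123)

# iris is only a stand-in data set so that this sketch is self-contained.
# localImp = TRUE asks randomForest to store casewise importance, which
# also turns on the overall permutation importance.
forest <- randomForest(Species ~ ., data = iris, localImp = TRUE)

# The local importance matrix is only present when localImp = TRUE.
dim(forest$localImportance)

# Render the HTML report; passing the training data is optional.
explain_forest(forest, interactions = FALSE, data = iris)
```

If the forest has already been trained without localImp = TRUE, it has to be refit; the missing importance values cannot be added afterwards.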
Running explain_forest on a model trained on a popular educational data set (German Credit Data) throws the following error:
Quitting from lines 81-82 (Explain_forest_template.Rmd)
Error in `[.data.frame`(rankings, , measures) : undefined columns selected
Code to reproduce:
library(tidyverse)
library(randomForest)
> Warning: package 'randomForest' was built under R version 3.4.4
> randomForest 4.6-14
> Type rfNews() to see new features/changes/bug fixes.
>
> Attaching package: 'randomForest'
> The following object is masked from 'package:dplyr':
>
> combine
> The following object is masked from 'package:ggplot2':
>
> margin
library(randomForestExplainer)
set.seed(123)
credit <- read_csv('http://invidio.drl.pl/files/german_credit.csv')
> Parsed with column specification:
> cols(
> .default = col_character(),
> default = col_integer(),
> duration_in_month = col_integer(),
> credit_amount = col_integer(),
> installment_as_income_perc = col_integer(),
> present_res_since = col_integer(),
> age = col_integer(),
> credits_this_bank = col_integer(),
> people_under_maintenance = col_integer()
> )
> See spec(...) for full column specifications.
credit <- credit %>% mutate_if(is.character, as.factor) %>% mutate(default = as.factor(default))
> Warning: package 'bindrcpp' was built under R version 3.4.4
credit_shuffled <- sample_frac(credit, 1)
n <- nrow(credit_shuffled)
n_train <- round(0.8 * n)
train_indices <- sample(1:n, n_train)
credit_train <- credit_shuffled[train_indices,]
credit_test <- credit_shuffled[-train_indices,]
glimpse(credit_train)
> Observations: 800
> Variables: 21
> $ default 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0,...
> $ account_check_status 0 <= ... < 200 DM, no checking acco...
> $ duration_in_month 12, 21, 24, 24, 12, 18, 36, 48, 18,...
> $ credit_history critical account/ other credits exi...
> $ purpose car (new), business, radio/televisi...
> $ credit_amount 2366, 1572, 3777, 2197, 1412, 866, ...
> $ savings 500 <= ... < 1000 DM, .. >= 1000 DM...
> $ present_emp_since 4 <= ... < 7 years, .. >= 7 years, ...
> $ installment_as_income_perc 3, 4, 4, 4, 4, 4, 1, 4, 4, 4, 4, 1,...
> $ personal_status_sex male : divorced/separated, female :...
> $ other_debtors none, none, none, none, guarantor, ...
> $ present_res_since 3, 4, 4, 4, 2, 2, 3, 2, 1, 4, 1, 3,...
> $ property if not A121/A122 : car or other, no...
> $ age 36, 36, 50, 43, 29, 25, 31, 38, 43,...
> $ other_installment_plans none, bank, none, none, none, none,...
> $ housing own, own, own, own, own, own, own, ...
> $ credits_this_bank 1, 1, 1, 2, 2, 1, 2, 1, 1, 2, 1, 1,...
> $ job management/ self-employed/ highly q...
> $ people_under_maintenance 1, 1, 1, 2, 1, 1, 2, 1, 2, 1, 1, 1,...
> $ telephone yes, registered under the customers...
> $ foreign_worker yes, yes, yes, yes, yes, yes, yes, ...
credit_model <- randomForest( default ~ ., data = credit_train )
class(credit_model)
> [1] "randomForest.formula" "randomForest"
explain_forest(credit_model)
> processing file: Explain_forest_template.Rmd
> [1] accuracy_decrease and gini_decrease
> Quitting from lines 81-82 (Explain_forest_template.Rmd)
> Error in `[.data.frame`(rankings, , measures): undefined columns selected
Created on 2018-07-12 by the reprex package (v0.2.0).