Open ecorreig opened 3 years ago
The function machine_learn outputs models with wrong reference levels for factors. This code
library(dplyr)
library(healthcareai)

sino <- c("No", "Sí")

df <- tibble(
  x = sample(sino, 100, replace = TRUE),
  y = sample(sino, 100, replace = TRUE),
  z = sample(sino, 100, replace = TRUE),
  a = 1:100
) %>%
  mutate(
    across(c(x, y), function(x) factor(x, ordered = TRUE)),
    z = as.factor(z)
  )

mod <- machine_learn(df, outcome = z, models = "rf")
get_variable_importance(mod) %>% plot()
gives me:
sessionInfo():
sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=Catalan_Spain.1252  LC_CTYPE=Catalan_Spain.1252  LC_MONETARY=Catalan_Spain.1252
[4] LC_NUMERIC=C  LC_TIME=Catalan_Spain.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] forcats_0.5.0 caret_6.0-86 lattice_0.20-41 ggplot2_3.3.2 cvAUC_1.1.0
[6] data.table_1.13.0 ROCR_1.0-11 healthcareai_2.5.0 compareGroups_4.4.5 missForest_1.4
[11] itertools_0.1-3 iterators_1.0.12 foreach_1.5.0 randomForest_4.6-14 dplyr_1.0.2
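
For anyone trying to reproduce this, here is a minimal base-R sketch for checking which level comes first in each input factor before the data reaches machine_learn. It does not call healthcareai at all. The names x_ordered and z_plain are made up for illustration, and the values are built with rep() instead of sample() so the result is deterministic.

```r
# Base R only. Hypothetical names; same "No"/"Sí" values as the example above,
# but deterministic (rep instead of sample).
sino <- c("No", "Sí")

x_ordered <- factor(rep(sino, 50), ordered = TRUE)  # like columns x and y
z_plain   <- as.factor(rep(sino, 50))               # like the outcome z

# Levels are sorted when the factor is created; the first one is the
# reference level under R's default treatment contrasts (unordered factors).
print(levels(x_ordered))
print(levels(z_plain))

# Ordered factors use polynomial contrasts by default, unordered ones
# use treatment contrasts, so the two are encoded differently.
print(is.ordered(x_ordered))
print(is.ordered(z_plain))
print(contrasts(x_ordered))
print(contrasts(z_plain))
```

If the levels printed here differ from what the variable importance plot shows for the model, that would point at the preprocessing inside machine_learn and not at the input data.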