imbs-hl / ranger

A Fast Implementation of Random Forests
http://imbs-hl.github.io/ranger/

Feature "importance" #528

Closed lnicola closed 4 years ago

lnicola commented 4 years ago

Sorry for using the issue tracker for a support question, but it might be useful for other users, too.

Is it possible to get a measure of how "important" each feature in a model is? That is, if my dataset has 500 features and 50 of them are either noise or highly correlated with others, can I determine which?

Droelf-source commented 4 years ago

A ranger call has an "importance" option that you can use to get a measure of this. I think four modes are currently implemented: "none", "impurity", "impurity_corrected" and "permutation". With "impurity" or "impurity_corrected", it gives you back which features bring the greatest improvement in impurity. It can't really tell you which features are correlated with each other, though.

lnicola commented 4 years ago

Thanks, I misread the docs. The importance values are available as variable.importance.
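
For anyone landing here later, a minimal sketch of what this looks like in R. The iris dataset, the seed, and the variable names rf and imp are only illustrative assumptions; the thread does not say which data or mode was actually used.

```r
# Assumes the ranger package is installed: install.packages("ranger")
library(ranger)

set.seed(42)  # illustrative seed so repeated runs grow the same forest

# Fit a forest on the built-in iris data and ask ranger to record
# impurity-based importance while the trees are grown.
rf <- ranger(Species ~ ., data = iris, importance = "impurity")

# Named numeric vector with one entry per feature.
imp <- rf$variable.importance

# Show the features ordered from most to least important.
print(sort(imp, decreasing = TRUE))
```

With the default importance = "none", variable.importance is not filled in, so the option has to be set when the model is fitted. Low scores can hint at noise features, but as noted above they do not show which features are correlated with each other.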