Closed prubbens closed 5 years ago
I am currently using the `imp_history` variable (and added it as an attribute of the base Boruta object). I was wondering: is this the history of importances for every variable, or the difference in importance between each variable and its shadow variables?
I'm also struggling with this. It would be awesome to have access to the Z-scores.
It seems that `imp_history` is the feature importance reported by the estimator (i.e. the random forest), for the real variables only and not the shadow ones:
`cur_imp` is given by `_add_shadows_get_imps`, which returns `[imp_real, imp_sha]`, where each `imp` corresponds to `estimator.feature_importances_`
then `cur_imp[0]` = `imp_real` is appended to `imp_hist`
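To make that bookkeeping concrete, here is a toy sketch of the loop as I read it. It is not the library's code: `toy_importance` is a hypothetical stand-in for the estimator's importances, the data is made up, and `sha_hist` is something I added for comparison (the code discussed above only keeps the real part).

```python
import random


def toy_importance(columns, y):
    """Hypothetical stand-in for estimator importances: |covariance| with y."""
    n = len(y)
    y_mean = sum(y) / n
    imps = []
    for col in columns:
        c_mean = sum(col) / n
        cov = sum((c - c_mean) * (t - y_mean) for c, t in zip(col, y)) / n
        imps.append(abs(cov))
    return imps


def add_shadows_get_imps(columns, y, rng):
    """Add one shuffled (shadow) copy per real column, score everything,
    then split the scores into [imp_real, imp_sha]."""
    shadows = []
    for col in columns:
        shadow = list(col)
        rng.shuffle(shadow)  # shuffling breaks any relation to y
        shadows.append(shadow)
    imps = toy_importance(columns + shadows, y)
    n_real = len(columns)
    return [imps[:n_real], imps[n_real:]]


rng = random.Random(0)
y = [float(i) for i in range(20)]
columns = [
    [2.0 * v for v in y],        # informative feature
    [rng.random() for _ in y],   # pure noise feature
]

imp_hist = []  # one row of real-feature importances per iteration
sha_hist = []  # assumption: kept here only to compare against the shadows
for _ in range(5):
    cur_imp = add_shadows_get_imps(columns, y, rng)
    imp_hist.append(cur_imp[0])  # only the real part goes into the history
    sha_hist.append(cur_imp[1])
```

In this toy version the real importances are identical in every iteration because nothing is refit; with a random forest they would vary from run to run. The point is only that each row of `imp_hist` holds raw importances of the real variables, not a difference against the shadows.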
I guess it should be possible to calculate confidence intervals by looping through the individual trees of the random forest, as described here: http://scikit-learn.org/stable/auto_examples/ensemble/plot_forest_importances.html
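A minimal sketch of that idea, assuming you already have one row of importances per tree. The function name `importance_z_scores` and the numbers are made up for illustration, and I use the sample standard deviation and define Z as mean divided by standard deviation, which is my reading of the paper rather than anything the package exposes.

```python
from statistics import mean, stdev


def importance_z_scores(per_tree_importances):
    """Return one (mean, std, z) tuple per feature, with z = mean / std
    computed across trees. z is NaN when the spread is zero."""
    per_feature = list(zip(*per_tree_importances))  # transpose: rows -> features
    results = []
    for values in per_feature:
        m = mean(values)
        s = stdev(values)  # sample standard deviation (an assumption)
        z = m / s if s > 0 else float("nan")
        results.append((m, s, z))
    return results


# Made-up importances: 4 trees (rows) x 3 features (columns).
per_tree = [
    [0.60, 0.30, 0.10],
    [0.50, 0.35, 0.15],
    [0.70, 0.25, 0.05],
    [0.60, 0.30, 0.10],
]
stats = importance_z_scores(per_tree)
for idx, (m, s, z) in enumerate(stats):
    print(f"feature {idx}: mean={m:.3f} std={s:.3f} z={z:.2f}")
```

With a fitted scikit-learn forest, the rows could come from `[tree.feature_importances_ for tree in forest.estimators_]`, which is what the linked example does to draw its error bars.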
Is it possible to access the individual Z-scores of the variables, for example to reproduce the visualization in Fig. 2 of the original paper?

![image](https://user-images.githubusercontent.com/16348558/35229488-d61d8d66-ff93-11e7-8396-e1d5e62fe612.png)