sorhawell / forestFloor

R package to visualize mapping structures of random forests with feature contributions
http://forestFloor.dk
GNU General Public License v2.0

Support for XgBoost? #25

Open BTessoulin opened 7 years ago

BTessoulin commented 7 years ago

Hi, many thanks for your very useful, nicely built and intuitive tool. You mentioned somewhere that forestFloor might be able to handle boosted trees from xgboost. Do you have any news on that?

By the way, rf is already really powerful for my needs, but xgboost is a very effective companion when it comes to computation time.

Have a nice day!

Ben

sorhawell commented 7 years ago

Hi, glad you liked it :)

Feature contributions for boosted trees, short story: yes, it is possible! Unfortunately, it is not so desirable.

Understanding the learned structure of a non-linear multivariate model can be difficult. Often, however, the model can be broken down into separate components that are easy to understand. The feature contribution (FC) is just one useful way to decompose a decision forest model. With forestFloor I argue that FCs are very useful for breaking down a decision forest into the relevant additive effects and possible low-order interaction effects.

The FC decomposition is useful for random forest models because it tends to split the model into the lowest-order effects. Why describe something as two interactions cancelling each other out if the model structure could be described as two main effects?

The feature contribution decomposition can be extended to boosted trees, see section 1.5, "forest floor visualizations of gradient boosted trees". However, I found the decomposition less useful for gradient boosted trees, because it tends not to decompose the model structure into the lowest-order effects. Notice, for example, in the PDF that variable X2 is not decomposed into an additive effect; its contribution is in fact due to an interaction with X1. Maybe it would be possible to come up with some modified decomposition method for gradient boosting, but I don't have an answer to that right now.
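For readers who have not met feature contributions before, here is a minimal sketch of the idea for a single regression tree. It is written in plain Python rather than R, and it is not forestFloor's or xgboost's code: the nested-dict tree layout, the function name `feature_contributions`, and the toy numbers are all assumptions made up for illustration. The sketch assumes each node stores the mean training target of that node, and that a sample's step from a parent to a child is credited to the feature the parent splits on.

```python
# Minimal sketch of feature contributions for one regression tree.
# The nested-dict tree layout and all names are hypothetical; this is
# not the forestFloor or xgboost API.

def feature_contributions(tree, x, n_features):
    """Return (bias, contributions) for a single sample x.

    Every node stores "value", assumed to be the mean training target
    in that node. When a split on feature j sends x to a child, the
    change in node value (child minus parent) is credited to feature j.
    """
    contributions = [0.0] * n_features
    node = tree
    bias = node["value"]  # root mean: the prediction before any split
    while "feature" in node:  # only internal nodes carry a split
        j = node["feature"]
        child = node["left"] if x[j] <= node["threshold"] else node["right"]
        contributions[j] += child["value"] - node["value"]
        node = child
    return bias, contributions


# Toy tree: split on feature 0 first, then on feature 1 in the right branch.
toy_tree = {
    "value": 10.0, "feature": 0, "threshold": 0.5,
    "left": {"value": 6.0},
    "right": {
        "value": 14.0, "feature": 1, "threshold": 2.0,
        "left": {"value": 12.0},
        "right": {"value": 16.0},
    },
}

x = [0.9, 3.0]
bias, contribs = feature_contributions(toy_tree, x, n_features=2)
prediction = bias + sum(contribs)  # equals the value of the leaf x lands in
print(bias, contribs, prediction)  # prints: 10.0 [4.0, 2.0] 16.0
```

The bias plus the contributions always adds up to the value of the leaf the sample ends in, so the decomposition is exact for one tree. For an ensemble, the per-tree contributions would be combined the same way the ensemble combines its trees' predictions (averaged for a random forest, summed for boosted trees). Note that in this toy tree feature 1 only receives a contribution for samples that go right at the first split, which is a small instance of a contribution depending on another variable.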

BTessoulin commented 7 years ago

Thank you for this detailed explanation, I clearly see what you mean! I'll continue with your decomposition for RF and, on the other hand, with GBM!