crflynn / skranger

scikit-learn compatible Python bindings for ranger C++ random forest library
https://skranger.readthedocs.io/en/stable/
GNU General Public License v3.0

TreeSHAP #66

Closed: alexisdrakopoulos closed this issue 2 years ago

alexisdrakopoulos commented 3 years ago

Any idea whether TreeSHAP could reasonably be implemented? I haven't seen an R version, but it would be very useful.

crflynn commented 3 years ago

Are you referring to https://github.com/ModelOriented/treeshap ? It looks like ranger is supported according to the readme there.

I found a Python SHAP lib here: https://github.com/slundberg/shap. I'm guessing the implementation would have to be done in the shap lib rather than here, but I'm happy to support any changes required to make that possible.

salimamoukou commented 3 years ago

Hello @crflynn,

I maintain a library that computes Shapley values (SV) for tree-based models (ACV, https://github.com/salimamoukou/acv00), and I wanted to make it compatible with skranger. However, I have a problem with the format of the "values" stored in each tree. In multi-class classification, what do the values of each node correspond to, and why are they always of dimension 1 (skranger/tree/_tree/values)? Note that for sklearn trees, the values of each node usually have dimension equal to the number of classes.
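A minimal sketch of the sklearn layout referred to above, assuming scikit-learn is installed. The iris data and the `max_depth` setting are only illustrative choices, not anything taken from skranger or ACV. It shows that a fitted sklearn classification tree stores one entry per class at every node:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Illustrative 3-class problem; any multi-class dataset would do.
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

tree = clf.tree_
n_classes = len(clf.classes_)

# tree_.value has shape (node_count, n_outputs, max_n_classes):
# each node carries a vector with one entry per class.
value_shape = tree.value.shape
print("value shape:", value_shape)
print("root node value:", tree.value[0])
```

Tree explainers that read sklearn-style trees generally rely on this per-class vector at each node, which is why a node value of dimension 1 is a problem in the multi-class case.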

crflynn commented 2 years ago

@salimamoukou I merged a fix for this. I'm still not able to get SHAP working, though; I must be missing some detail(s) there. I will try to make a release this week.

crflynn commented 2 years ago

The latest version of skranger (0.7.0) should now work with shap and ACV, although generating tree details is still very slow. For shap, follow the example here: https://skranger.readthedocs.io/en/stable/tree_interface.html#shap