parrt / dtreeviz

A python library for decision tree visualization and model interpretation.
MIT License

VisualisationNotYetSupportedError: get_min_samples_leaf() is not implemented yet for XGBoost. #303

Open SPDA36 opened 11 months ago

SPDA36 commented 11 months ago

Opening this issue as requested by the VisualisationNotYetSupportedError message:

---------------------------------------------------------------------------
VisualisationNotYetSupportedError         Traceback (most recent call last)
Cell In[298], line 6
      3 features = X_train_rfe.columns
      5 for i,ax in enumerate(axs.flatten()):
----> 6     tree_viz.rtree_feature_space(features=[features[i]], ax=ax)

File ~\miniconda3\envs\site-packages\dtreeviz\trees.py:1047, in DTreeVizAPI.rtree_feature_space(self, fontsize, ticks_fontsize, show, split_linewidth, mean_linewidth, markersize, colors, fontname, n_colors_in_map, features, figsize, ax)
   1045     features = self.shadow_tree.feature_names[0:min(n_features,2)] # pick first one/two features if none given
   1046 if len(features) == 1:  # univar example
-> 1047     _rtreeviz_univar(self.shadow_tree, fontsize, ticks_fontsize, fontname, show, split_linewidth, mean_linewidth, markersize, colors,
   1048                      features[0], figsize, ax)
   1049 elif len(features) == 2:  # bivar example
   1050     _rtreeviz_bivar_heatmap(self.shadow_tree, fontsize, ticks_fontsize, fontname, show, n_colors_in_map, colors,
   1051                             markersize, features, figsize, ax)
File ~\miniconda3\envs\\Lib\site-packages\dtreeviz\trees.py:1766, in _rtreeviz_univar(shadow_tree, fontsize, ticks_fontsize, fontname, show, split_linewidth, mean_linewidth, markersize, colors, feature, figsize, ax)
   1763 _format_axes(ax, shadow_tree.feature_names[featidx], shadow_tree.target_name, colors, fontsize, fontname, ticks_fontsize=ticks_fontsize, grid=False)
   1765 if 'title' in show:
-> 1766     title = f"Regression Tree Depth {shadow_tree.get_max_depth()}, Samples per Leaf {shadow_tree.get_min_samples_leaf()},\nTraining $R^2$={shadow_tree.get_score()}"
   1767     ax.set_title(title, fontsize=fontsize, color=colors['title'])
File ~\miniconda3\envs\Lib\site-packages\dtreeviz\models\xgb_decision_tree.py:242, in ShadowXGBDTree.get_min_samples_leaf(self)
    241 def get_min_samples_leaf(self):
--> 242     raise VisualisationNotYetSupportedError("get_min_samples_leaf()", "XGBoost")
VisualisationNotYetSupportedError: get_min_samples_leaf() is not implemented yet for XGBoost. Please create an issue on https://github.com/parrt/dtreeviz/issues if you need this. Thanks!

tlapusan commented 11 months ago

@SPDA36, noted. I will try to implement it in the next few days, thanks!

SPDA36 commented 11 months ago

Thank you!!

tlapusan commented 11 months ago

@SPDA36 looking at the source code and your stack trace, I remember that the rtree_feature_space visualization method is not yet supported for XGBoost. Would it help to try it with the sklearn library?

I will give it a try this weekend to see if I can adapt it to XGBoost as well, but there were some issues extracting that information (min_samples_leaf) from the XGBoost tree metadata.
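
For illustration only, here is a minimal sketch of how the title line from the stack trace could degrade gracefully when a shadow tree cannot report min_samples_leaf. This is not the dtreeviz implementation: the exception class below is a local stand-in, and `FakeXGBShadowTree` and `build_title` are hypothetical names used only to mimic the three calls that appear in the traceback.

```python
class VisualisationNotYetSupportedError(Exception):
    """Local stand-in for the dtreeviz exception shown in the traceback."""

    def __init__(self, method_name, model_name):
        super().__init__(f"{method_name} is not implemented yet for {model_name}.")


class FakeXGBShadowTree:
    """Hypothetical stub: only the three methods used by the title line."""

    def get_max_depth(self):
        return 3

    def get_min_samples_leaf(self):
        # Mirrors the behaviour reported in this issue.
        raise VisualisationNotYetSupportedError("get_min_samples_leaf()", "XGBoost")

    def get_score(self):
        return 0.87


def build_title(shadow_tree):
    """Build the plot title, skipping the part the model cannot provide."""
    parts = [f"Regression Tree Depth {shadow_tree.get_max_depth()}"]
    try:
        parts.append(f"Samples per Leaf {shadow_tree.get_min_samples_leaf()}")
    except VisualisationNotYetSupportedError:
        pass  # leave "Samples per Leaf" out instead of failing the whole plot
    return ", ".join(parts) + f",\nTraining $R^2$={shadow_tree.get_score()}"


print(build_title(FakeXGBShadowTree()))
```

With a guard like this, the plot would still render for XGBoost and the title would simply omit the leaf sample count.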

SPDA36 commented 11 months ago

@tlapusan I have used .rtree_feature_space() with sklearn decision trees and random forests with zero issues. No worries if it won't work with xgboost, but I figured I would try since .rtree_feature_space() is such a cool feature.
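
A possible interim workaround, sketched here without having been tested against XGBoost: the stack trace shows that get_min_samples_leaf() is only called when 'title' is in the `show` argument, so passing a `show` value without 'title' should skip that call. The helper name `plot_feature_spaces_without_title` is hypothetical, and the assumption that 'splits' is the other accepted `show` entry should be checked against the installed dtreeviz version.

```python
def plot_feature_spaces_without_title(tree_viz, features, axes):
    """Draw one univariate feature-space plot per feature, omitting the title.

    tree_viz is expected to expose rtree_feature_space(features=..., ax=..., show=...),
    as in the traceback above. Leaving 'title' out of `show` avoids the code path
    that calls get_min_samples_leaf().
    """
    for feature, ax in zip(features, axes):
        # Assumption: 'splits' is a valid entry for `show`.
        tree_viz.rtree_feature_space(features=[feature], ax=ax, show={"splits"})
```

Using the names from the original snippet, this would be called as `plot_feature_spaces_without_title(tree_viz, X_train_rfe.columns, axs.flatten())`. The plots lose their title, which can be set manually with `ax.set_title(...)` if needed.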