parrt / dtreeviz

A python library for decision tree visualization and model interpretation.
MIT License

Crash when leaf nodes have no samples #305

Open taltstidl opened 10 months ago

taltstidl commented 10 months ago

Thanks for the great library. The visualizations are truly great. However, we're running into a crash when some of the leaf nodes do not have any samples in the X_train and y_train data. What seems to happen is the following:

Two possible solutions to this:

  1. https://github.com/parrt/dtreeviz/blob/a3d02a5fd382bb8e70af92cc9110d9ef73bc86a0/dtreeviz/trees.py#L1240 Here, render an empty SVG or similar so it can later be rendered (this would be my preferred option).
  2. https://github.com/parrt/dtreeviz/blob/a3d02a5fd382bb8e70af92cc9110d9ef73bc86a0/dtreeviz/trees.py#L607 Here, do not add a leaf if no file could be created (would likely need to add a suitable return value to _class_leaf_viz).
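
To illustrate the first option, here is a minimal sketch of what a placeholder could look like. It is not dtreeviz's actual code: `placeholder_svg` and `leaf_svg` are hypothetical names, the default size is arbitrary, and `render` stands in for whatever function currently draws a leaf.

```python
import xml.etree.ElementTree as ET
from typing import Callable


def placeholder_svg(width: int = 40, height: int = 20) -> str:
    """Return a minimal, valid, empty SVG document (hypothetical helper)."""
    root = ET.Element(
        "svg",
        {
            "xmlns": "http://www.w3.org/2000/svg",
            "width": str(width),
            "height": str(height),
        },
    )
    return ET.tostring(root, encoding="unicode")


def leaf_svg(n_samples: int, render: Callable[[], str]) -> str:
    """Return an empty placeholder for a leaf without samples.

    `render` is an assumed zero-argument callable that produces the SVG
    for a leaf that does have samples; it is only called in that case.
    """
    if n_samples == 0:
        return placeholder_svg()
    return render()
```

With something like this, a later step that expects one SVG per leaf would always find one, even for leaves that received no samples from `X_train`.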

tlapusan commented 9 months ago

Hi @taltstidl,

Thanks for raising this issue. There has been some related work, but it was never completed, e.g. https://github.com/parrt/dtreeviz/pull/299

I will take a look at it in the next few days. It also seems to be an issue when split nodes don't have samples.