cerlymarco / linear-tree

A Python library to build Model Trees with Linear Models at the leaves.
MIT License

LinearTreeClassifier with Gini Index of 0 #16

Closed: sik-flow closed this 2 years ago

sik-flow commented 2 years ago

I am using the LinearTreeClassifier and ran into an issue: it throws an error when a split has a Gini index of 0, i.e. when the node contains only a single class. See below:

[Screenshot: Screen Shot 2022-02-11 at 8 36 00 AM]

When looking at the splits with a decision tree we see the following:

[Screenshot: Screen Shot 2022-02-11 at 8 36 10 AM]

The colab notebook I used to create this issue is here: https://colab.research.google.com/drive/1NLWKZItwdRCmt6Dxmesqu75DLkXaJvs6?usp=sharing
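
For context, the Gini impurity of a node with class proportions p_k is 1 - sum(p_k^2), so a node that holds a single class always scores 0. The snippet below is only a minimal sketch of that arithmetic; the function name `gini_impurity` and the sample labels are illustrative and are not taken from linear-tree or from the notebook above.

```python
from collections import Counter


def gini_impurity(labels):
    """Return the Gini impurity 1 - sum(p_k^2) for the class labels of one node.

    Illustrative helper (an assumption for this example), not part of linear-tree.
    """
    n = len(labels)
    if n == 0:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    # p_k is the share of samples in the node that belong to class k.
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


pure_node = [1, 1, 1, 1]    # only one class present
mixed_node = [0, 0, 1, 1]   # two classes, evenly split

print(gini_impurity(pure_node))   # 0.0
print(gini_impurity(mixed_node))  # 0.5
```

A leaf like `pure_node` gives a classifier nothing to separate, which is the situation described in this issue.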

cerlymarco commented 2 years ago

Hi, thanks for your feedback.

I think this happens because you are using a LinearTreeRegressor for a classification problem. If you use a LinearTreeClassifier (together with LogisticRegression), this shouldn't happen: at each partition split the class distribution is checked, and the classifier is replaced by a DummyEstimator when only a single class is present (as you can see here).

All the best