cerlymarco / linear-tree

A Python library to build Model Trees with Linear Models at the leaves.
MIT License

Performing Split on Node with Perfect Results #14

Closed (sik-flow closed 2 years ago)

sik-flow commented 2 years ago

I have an example where the tree performs a split on a node with a loss of 0. Take a look at the example below: it splits node 1, where the loss = 0. This split adds no value, since the parent node (node 1) already gives perfect results.

Is this the intended behavior? Or should the tree stop splitting when the results are already perfect?

linear_tree

cerlymarco commented 2 years ago

Hi, thanks for your feedback.

This is not the intended behavior. Splits are computed only as long as they provide utility (as you can see from the source code).

I tried to reproduce a similar case by fitting a LinearTreeRegressor on this data:
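To make "utility" concrete, here is a minimal sketch of such a check. It is an illustration only, not the library's actual source: the function names and the sample-weighted averaging of the child losses are assumptions.

```python
def weighted_child_loss(left_loss, n_left, right_loss, n_right):
    """Average the two child losses, weighted by their sample counts."""
    n_total = n_left + n_right
    return (n_left * left_loss + n_right * right_loss) / n_total


def split_has_utility(parent_loss, left_loss, n_left, right_loss, n_right):
    """Hypothetical rule: a split is useful only if it strictly lowers the loss."""
    children = weighted_child_loss(left_loss, n_left, right_loss, n_right)
    return children < parent_loss


# A split that lowers the loss is accepted.
print(split_has_utility(4.0, 1.0, 10, 2.0, 10))

# A parent that is already perfect (loss exactly 0.0) cannot be improved.
print(split_has_utility(0.0, 0.0, 10, 0.0, 10))
```

Under this rule, a node whose loss is exactly 0.0 can never be split, because no pair of children can have a strictly lower loss.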

image

The fitted model results in:

image

In node1 there is no utility in splitting, so it is treated as a leaf. This is also evident from the predictions on the training set:

image

Maybe in your case, the loss of node1 is not exactly 0 (in the plot it's rounded).

Here is the running notebook.

If you support the project don't forget to leave a star ;-)

EDIT: This may also be due to the numeric precision of your environment, where a loss of (for example) 5.429976129669105e-29 is not equal to 0.0, so the tree continues to grow. This is automatically limited (by setting a fixed rounding precision) in lineartree>=0.3.4
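A small sketch of why this happens, in plain Python. The `PRECISION` value and the comparison are assumptions chosen for illustration, not the exact values or logic used inside lineartree.

```python
# Hypothetical rounding precision; the value actually used by the library may differ.
PRECISION = 5

parent_loss = 5.429976129669105e-29  # "zero" up to floating-point noise
child_loss = 0.0

# The tiny residual is not equal to zero, so a strict comparison still
# reports an improvement and a naive tree would keep splitting.
is_exact_zero = parent_loss == 0.0
naive_improves = child_loss < parent_loss

# Rounding both losses to a fixed precision removes the spurious gain.
rounded_improves = round(child_loss, PRECISION) < round(parent_loss, PRECISION)

print(is_exact_zero, naive_improves, rounded_improves)
```

If a plot shows a loss of 0 on a node that still gets split, printing the unrounded loss value is a quick way to check whether this is the cause.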