cerlymarco / linear-tree

A Python library to build Model Trees with Linear Models at the leaves.
MIT License

How is the value to split at chosen? #41

Closed ZmeiGorynych closed 2 months ago

ZmeiGorynych commented 2 months ago

Hi, let's say you're considering splitting a node along a certain float-valued dimension. How do you choose the candidate split values (that is, the thresholds the column values are compared against to decide whether a sample ends up in the right or left subtree)? To choose among the candidates, you compare the error values, but how do you choose the candidates themselves?

cerlymarco commented 2 months ago

Hi, the candidate splits are chosen by extracting quantiles from each feature's distribution: https://github.com/cerlymarco/linear-tree/blob/2982edc050206521fa9cde7df1b1f88ab7b2183d/lineartree/_classes.py#L336-L341
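For readers comparing approaches, here is a minimal sketch of the idea described above: candidate thresholds taken from quantiles of one feature, then scored by the error of a linear fit on each side. This is not the code at the linked lines. The helper names `candidate_splits` and `best_split`, the `max_bins` and `min_leaf` parameters and their defaults, and the use of summed squared error with `np.polyfit` are all assumptions made for illustration.

```python
import numpy as np


def candidate_splits(x, max_bins=25):
    """Hypothetical helper: candidate thresholds from quantiles of one feature."""
    # Use interior quantile levels only; levels 0 and 1 would leave one side empty.
    levels = np.linspace(0, 1, max_bins)[1:-1]
    # Quantiles repeat when the feature has few distinct values, so deduplicate.
    return np.unique(np.quantile(x, levels))


def best_split(x, y, max_bins=25, min_leaf=3):
    """Hypothetical helper: pick the candidate with the lowest summed squared
    error of two separate straight-line fits (left and right of the threshold)."""
    best_t, best_loss = None, np.inf
    for t in candidate_splits(x, max_bins):
        left = x <= t
        right = ~left
        # Skip thresholds that leave too few samples on either side.
        if left.sum() < min_leaf or right.sum() < min_leaf:
            continue
        loss = 0.0
        for mask in (left, right):
            coef = np.polyfit(x[mask], y[mask], deg=1)
            resid = y[mask] - np.polyval(coef, x[mask])
            loss += float(resid @ resid)
        if loss < best_loss:
            best_t, best_loss = t, loss
    return best_t, best_loss
```

Using quantiles instead of every distinct value caps the number of model fits per feature at roughly `max_bins`, at the cost of possibly missing the exact optimal threshold.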

If you support the project, don't forget to leave a star ;-) All the best

ZmeiGorynych commented 2 months ago

Perfect, thanks a lot! I'm not using linear-tree directly right now, but am doing something very similar here, so just wanted to compare methodologies :)