jinlow / forust

A lightweight gradient boosted decision tree package.
https://jinlow.github.io/forust/
Apache License 2.0

Max leaves not always honored #32

Closed jinlow closed 9 months ago

jinlow commented 1 year ago

Right now, the max_leaves criterion is checked only at the start of splitting a new node, and the check simply verifies whether we have already created more leaves than max_leaves. Because of this, the limit can be overshot: with max_leaves set to 5 and only 4 leaves at the top of the node split loop, the check passes, the node is split, and a split that adds two leaves leaves the tree with 6. Only on the next pass through the loop would training for that tree break, by which point the limit has already been exceeded. We should add a method to each splitter, such as new_nodes_added(), and add its value to the current number of leaves before splitting to determine whether splitting this node would push us over max_leaves.
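
A rough sketch of what that pre-split check could look like. This is not forust's actual code: the `Splitter` trait here is a stand-in, `BinarySplitter`, `ThreeWaySplitter` and `split_fits` are hypothetical names, and it assumes `new_nodes_added()` returns the net increase in leaf count from one split.

```rust
/// Stand-in trait for illustration only; not forust's real splitter trait.
trait Splitter {
    /// Assumed meaning: net number of leaves one split adds to the tree.
    fn new_nodes_added(&self) -> usize;
}

/// Hypothetical splitter: one leaf becomes two children, so net +1 leaf.
struct BinarySplitter;

/// Hypothetical splitter: one leaf becomes three children, so net +2 leaves.
struct ThreeWaySplitter;

impl Splitter for BinarySplitter {
    fn new_nodes_added(&self) -> usize {
        1
    }
}

impl Splitter for ThreeWaySplitter {
    fn new_nodes_added(&self) -> usize {
        2
    }
}

/// Returns true if splitting one more node keeps the tree within `max_leaves`.
/// The check runs before the split, so the limit is never overshot.
fn split_fits<S: Splitter>(splitter: &S, n_leaves: usize, max_leaves: usize) -> bool {
    n_leaves + splitter.new_nodes_added() <= max_leaves
}
```

With 4 leaves and max_leaves set to 5, `split_fits` accepts the split for the splitter that adds one leaf and rejects it for the one that adds two, which is the case described above.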

jinlow commented 9 months ago

Closed with #81