Right now, the max_leaves criterion is checked only at the start of splitting a new node. It simply checks whether we have already created more leaves than max_leaves. Because of this, you can hit a scenario where max_leaves is set to 5 and there are only 4 leaves at the top of the node split loop; the node is split, and then there are 6 leaves, at which point training for that tree would break.
We should add an attribute or method to each splitter, such as new_nodes_added(), that reports how many nodes a split will add. We can then add that value to the current number of leaves, before splitting, to determine whether splitting this node would push us over max_leaves.
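
A minimal sketch of what that check might look like, written in Python for illustration only. The splitter classes, split_would_exceed, and grow are hypothetical names, not existing code, and the sketch assumes new_nodes_added() returns the net increase in the leaf count caused by one split.

```python
class TwoLeafSplitter:
    """Hypothetical splitter whose split raises the leaf count by 2."""

    def new_nodes_added(self) -> int:
        # Assumption: the return value is the net increase in leaves.
        return 2


class OneLeafSplitter:
    """Hypothetical splitter whose split raises the leaf count by 1."""

    def new_nodes_added(self) -> int:
        return 1


def split_would_exceed(n_leaves: int, max_leaves: int, splitter) -> bool:
    """Return True if performing one more split would exceed max_leaves."""
    return n_leaves + splitter.new_nodes_added() > max_leaves


def grow(max_leaves: int, splitter, start_leaves: int = 1) -> int:
    """Toy split loop: keep splitting only while the check allows it."""
    n_leaves = start_leaves
    while not split_would_exceed(n_leaves, max_leaves, splitter):
        n_leaves += splitter.new_nodes_added()
    return n_leaves


# The scenario from above: 4 leaves, max_leaves of 5, a split that adds 2.
blocked = split_would_exceed(4, 5, TwoLeafSplitter())
```

With this check at the top of the loop, the split in the scenario above is rejected before it happens, so the tree stops at 4 leaves instead of reaching 6.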