curso-r / treesnip

Parsnip backends for `tree`, `lightGBM` and `Catboost`
https://curso-r.github.io/treesnip
GNU General Public License v3.0
85 stars 13 forks source link

max_depth is constraining num_leaves for deep trees #57

Closed dfsnow closed 2 years ago

dfsnow commented 2 years ago

LightGBM grows leaf-wise, so it's possible to get a very unbalanced tree. Depending on your data, it's completely possible to end up with a tree that has fewer than 2^17 leaves but a depth much greater than 17.

I think the lines below are intended to cap the number of leaves at the max (2^17), but they're actually capping the depth, NOT the total number of leaves. This seems like a fairly critical bug since you're inadvertently limiting the possible depth of trees.

https://github.com/curso-r/treesnip/blob/3e4c2207cd137f1764b10ea019cea2218875bc48/R/lightgbm.R#L258-L261

Edit: Just realized that max_depth is never assigned back to arg_list so it doesn't actually cap anything it, just throws a warning.