gbm-developers / gbm

Gradient boosted models (the old gbm package)

Use weights instead of observations as stopping criterion for terminal nodes #48

Closed by benmarchi 4 years ago

benmarchi commented 4 years ago

I was wondering if it is (or would be) possible to use a weight-based measure instead of an observation count for determining terminal nodes. In particular, is there an efficient way to implement an equivalent of n.minobsinnode with weights? One possible workaround involves replicating the rows in the original data.frame based on normalized weights (weights <- weights/min(weights)) and fitting the GBM using this new, expanded data.frame. The downside of this method is that it can explode the size of the problem, so I am looking for something that will not change the structure of the underlying data.
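
For readers following along, the row-replication workaround described above might look roughly like the sketch below. It uses base R only. The toy data, the column names (`x`, `y`, `w`), and the choice to round the normalized weights to whole replication counts are all assumptions made for illustration; they are not part of the original post or of the gbm API.

```r
# Hypothetical toy data; `w` holds the observation weights.
df <- data.frame(
  x = c(0.1, 0.4, 0.7, 0.9),
  y = c(1.0, 1.5, 2.1, 2.8),
  w = c(0.5, 1.0, 1.5, 2.5)
)

# Normalize so the smallest weight becomes 1, as in the post.
norm_w <- df$w / min(df$w)

# Replication counts must be whole numbers. Rounding is an assumption
# made here; it loses information when the ratios are not integers.
reps <- round(norm_w)

# Repeat each row index according to its count and drop the weight column.
expanded <- df[rep(seq_len(nrow(df)), times = reps), c("x", "y")]
rownames(expanded) <- NULL

# Ratio of expanded to original row counts: the size blow-up the post warns about.
blowup <- nrow(expanded) / nrow(df)
```

The expanded data.frame would then be passed to gbm() without weights, so that n.minobsinnode counts replicated rows and behaves approximately like a minimum total weight measured in units of the smallest weight. The blow-up factor grows with the ratio of the largest to the smallest weight, which is the cost the question is trying to avoid.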

bgreenwell commented 4 years ago

Hi @benmarchi. Unfortunately, I doubt it. This would (I think) require changes to the underlying C++ code, which I’m hesitant to make since this package is only being maintained for backwards compatibility and bug fixes. It would be a more appropriate feature for the gbm3 📦, though I don’t think that package is actively maintained at the moment!

benmarchi commented 4 years ago

@bgreenwell Thanks for following up. I figured it was a long shot that there would be an easy fix.