harrysouthworth / gbm

Gradient boosted models

Shrinkage as a vector and learning rate decay #36

Closed DexGroves closed 9 years ago

DexGroves commented 9 years ago

Some changes to the C++ to allow the user to give shrinkage as a vector, or to supply shrinkage.decay as a parameter. The general idea is to let gbm learn quickly at first but more slowly as trees get added.

All old code will (AKA should) still work; I've tested this on a bunch of Gaussian and Laplace models. Giving shrinkage as a non-vector, gbm.fit and gbm.more all work just fine.
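Roughly, the intended usage looks like the sketch below. This is only illustrative: I'm assuming here that the vector form wants one learning rate per tree (length equal to n.trees) and that shrinkage.decay is a multiplier applied after each tree; check the C++ for the exact argument handling, and the data is just a toy example.

```r
library(gbm)

# Toy regression data, purely illustrative
set.seed(1)
n <- 2000
x <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- x$x1 + 2 * sin(x$x2) + rnorm(n, sd = 0.3)
n.trees <- 500

# Option 1: shrinkage as a vector, one learning rate per tree.
# Start at 0.1 and decay geometrically towards a floor of 0.001
# (assumes the patched code expects length(shrinkage) == n.trees).
shrink.vec <- pmax(0.1 * 0.99^(seq_len(n.trees) - 1), 0.001)

fit1 <- gbm.fit(x, y, distribution = "gaussian",
                n.trees = n.trees, interaction.depth = 3,
                shrinkage = shrink.vec, nTrain = floor(0.8 * n))

# Option 2: scalar starting shrinkage plus a decay parameter
# (assumes shrinkage.decay multiplies the rate after every tree).
fit2 <- gbm.fit(x, y, distribution = "gaussian",
                n.trees = n.trees, interaction.depth = 3,
                shrinkage = 0.1, shrinkage.decay = 0.99,
                nTrain = floor(0.8 * n))
```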

az0 commented 9 years ago

Have you tested how this improves performance? Two years ago I suggested adding momentum to gbm and got a response that "It turns out that it doesn't work." If it really does work, it could save a lot of time.

harrysouthworth commented 9 years ago

Yep: https://gradientboostedmodels.googlecode.com/files/report.pdf. We found that the decaying-shrinkage approach quickly descends into a local minimum and then can't get out, so using a small shrinkage from the outset outperforms it. Is your experience different?
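One way to see the effect is to compare held-out error curves for a constant small shrinkage versus a decaying schedule. A sketch against the patched interface above (it reuses x, y, n.trees and fit1 from that example; the settings are illustrative, not the ones in the report):

```r
# Constant small shrinkage from the outset, same data and tree count
fit.const <- gbm.fit(x, y, distribution = "gaussian",
                     n.trees = n.trees, interaction.depth = 3,
                     shrinkage = 0.01, nTrain = floor(0.8 * n))

# Compare held-out error per iteration against the decaying-schedule fit (fit1)
plot(fit1$valid.error, type = "l",
     xlab = "number of trees", ylab = "held-out squared error")
lines(fit.const$valid.error, lty = 2)
legend("topright", legend = c("decaying shrinkage", "constant 0.01"), lty = 1:2)
```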

DexGroves commented 9 years ago

This is an interesting read, thanks for the link! I didn't know this had been tried before.

harrysouthworth commented 9 years ago

It's implemented in here: https://github.com/harrysouthworth/gbt. So far as I recall, only Gaussian and binomial are in there, though.