Closed DexGroves closed 9 years ago
Have you tested how this improves performance? Two years ago I suggested adding momentum to gbm and got a response that "It turns out that it doesn't work." If it really does work, it could save a lot of time.
Yep, see https://gradientboostedmodels.googlecode.com/files/report.pdf. We found that it quickly descends into a local minimum and then can't get out, so using a small shrinkage from the outset outperforms it. Is your experience different?
This is an interesting read, thanks for the link! I didn't know this had been tried before.
It's implemented here: https://github.com/harrysouthworth/gbt. So far as I recall, only Gaussian and binomial are in there, though.
Some changes to the C++ to allow the user to give shrinkage as a vector, or to supply shrinkage.decay as a parameter. The general idea is to let gbm learn quickly at first but more slowly as trees get added.
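To make the idea concrete, here's a minimal sketch (in Python, not the actual gbm C++): a toy boosting loop where the shrinkage is a vector with one learning rate per tree, decaying geometrically so early trees take big steps and later trees take small ones. The `decay` value here stands in for what a shrinkage.decay parameter would control; all names are hypothetical.

```python
import numpy as np

def stump_fit(x, y):
    """Fit a depth-1 regression tree (best single split on x)."""
    best = None
    for s in np.unique(x)[:-1]:          # exclude max so both sides are nonempty
        left, right = y[x <= s], y[x > s]
        sse = ((left - left.mean())**2).sum() + ((right - right.mean())**2).sum()
        if best is None or sse < best[0]:
            best = (sse, s, left.mean(), right.mean())
    _, s, lv, rv = best
    return lambda xq: np.where(xq <= s, lv, rv)

def boost(x, y, n_trees=50, shrinkage0=0.5, decay=0.95):
    # Shrinkage as a vector: one rate per tree, large at first,
    # smaller as trees get added (geometric decay).
    shrinkage = shrinkage0 * decay ** np.arange(n_trees)
    pred = np.full_like(y, y.mean())
    for nu in shrinkage:
        tree = stump_fit(x, y - pred)    # fit the current residuals
        pred = pred + nu * tree(x)       # shrunken additive update
    return pred

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 200)
print(np.mean((y - boost(x, y))**2))     # training MSE after boosting
```

A constant shrinkage is just the special case `decay=1`, so passing a scalar and broadcasting it to a vector keeps the old behaviour.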
All old code will (read: should) still work; I've tested this on a bunch of Gaussian and Laplace models. Giving shrinkage as a scalar, gbm.fit and gbm.more all work just fine.