stanfordmlgroup / ngboost

Natural Gradient Boosting for Probabilistic Prediction
Apache License 2.0

NGBoost for multi-class classification problems? #173

Closed · kdlin closed this 4 years ago

kdlin commented 4 years ago

Are there any benefits to using NGBoost for multi-class classification problems? Other algorithms like XGBoost and LightGBM can already output probabilities for each class. Is there a reason to use NGBoost? Will it outperform the others?

alejandroschuler commented 4 years ago

There is no practical reason to favor NGBoost over other boosting algorithms for classification (binary or multiclass). Other packages implement a variety of small optimizations for speed and performance which can add up on a particular problem, so I'd expect NGBoost to perform a tiny bit worse than another boosting algorithm in a head-to-head comparison on classification tasks. It may be that NGBoost has better calibration - I do not know, and this would be interesting to test.
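For anyone curious, here is a rough sketch of how that calibration check could go on a binary problem (the synthetic dataset and settings are just placeholders; scikit-learn's `calibration_curve` does the binning):

```python
# Hypothetical calibration check, not a benchmark: compare NGBoost's predicted
# probabilities against observed class frequencies on held-out data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.calibration import calibration_curve
from ngboost import NGBClassifier

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# NGBClassifier defaults to a Bernoulli distribution for binary targets
ngb = NGBClassifier(verbose=False).fit(X_train, y_train)
p = ngb.predict_proba(X_test)[:, 1]  # probability of the positive class

# For a well-calibrated model, observed frequency tracks predicted probability
frac_pos, mean_pred = calibration_curve(y_test, p, n_bins=10)
for fp, mp in zip(frac_pos, mean_pred):
    print(f"predicted {mp:.2f} -> observed {fp:.2f}")
```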

There is one aesthetic reason to use NGBoost for classification: the right way to do classification in NGBoost falls naturally out of its formulation as a distributional estimation problem. In traditional boosting algorithms, you explicitly or implicitly have to choose a splitting rule and decide how to hack the algorithm to allow for multi-class prediction. All of those problems have been efficiently and practically solved many times over by effective hacks and heuristics, such that things just work from the user's perspective. In NGBoost, there are no arbitrary choices or heuristics; everything follows naturally once you define the score and the distribution. So if you want a boosting algorithm for classification that has good feng shui, then NGBoost is the way to go, I suppose.
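To make that concrete, here is a minimal multiclass sketch using the `k_categorical` distribution from `ngboost.distns` (the iris dataset is just for illustration): you pick the categorical distribution for your number of classes, and nothing else about the algorithm changes.

```python
# Minimal multiclass sketch: the only classification-specific choice is the
# distribution passed via Dist; the boosting machinery itself is unchanged.
from sklearn.datasets import load_iris
from ngboost import NGBClassifier
from ngboost.distns import k_categorical

X, y = load_iris(return_X_y=True)  # 3 classes, integer labels 0..2

ngb = NGBClassifier(Dist=k_categorical(3), verbose=False)
ngb.fit(X, y)

probs = ngb.predict_proba(X)  # shape (n_samples, 3): one probability per class
print(probs[:3])
```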