stanfordmlgroup / ngboost

Natural Gradient Boosting for Probabilistic Prediction
Apache License 2.0
1.65k stars, 215 forks

NGBClassifier uses DecisionTreeRegressor by default #53

Closed: Templarrr closed this issue 4 years ago

Templarrr commented 4 years ago

default_tree_learner is a DecisionTreeRegressor with the friedman_mse criterion, which is kinda weird to use for classification. I may be a bit confused here, but is it really fine to use a Regressor and not a Classifier as the base learner? It may be by design, but it looks really weird.

alejandroschuler commented 4 years ago

Hey @Templarrr, this is by design. The trees estimate the probability p(Y=1|X=x) for each x (the single parameter of the conditional Bernoulli distribution), which is a continuous number between 0 and 1. That's part of what is so elegant about ngboost: instead of requiring different formalisms for classification, regression, and survival problems, we assume that Y|X=x follows a particular form of distribution (normal, Bernoulli, etc.) and reduce the problem to estimating the parameters of that distribution as a function of x.
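To make the idea concrete, here is a minimal standard-library sketch of why a regression learner is a natural fit for a classification problem. It is not ngboost's code: `fit_stump` and `fit_boosted_bernoulli` are hypothetical names, the one-split stump is only a stand-in for a regression tree such as `DecisionTreeRegressor`, the parameter is boosted on the logit scale (an assumption of this sketch), and it follows the ordinary gradient of the Bernoulli log-likelihood rather than the natural gradient.

```python
import math
import random


def sigmoid(z):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, targets):
    """Least-squares regression stump on 1-D inputs (hypothetical stand-in
    for a regression tree). The targets are continuous, not class labels."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for k in range(1, len(order)):
        lo, hi = xs[order[k - 1]], xs[order[k]]
        if lo == hi:
            continue  # no valid split between identical x values
        left = [targets[i] for i in order[:k]]
        right = [targets[i] for i in order[k:]]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = (sum((t - mean_l) ** 2 for t in left)
               + sum((t - mean_r) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, (lo + hi) / 2.0, mean_l, mean_r)
    _, threshold, mean_l, mean_r = best
    return lambda x: mean_l if x <= threshold else mean_r


def fit_boosted_bernoulli(xs, ys, n_rounds=50, learning_rate=0.3):
    """Boost the logit of p(Y=1|X=x) using regression stumps.

    Each round fits a stump to y - p, the ordinary gradient of the
    Bernoulli log-likelihood with respect to the logit. (Assumption of
    this sketch: ordinary gradient, not the natural gradient.)"""
    prior = sum(ys) / len(ys)
    init = math.log(prior / (1.0 - prior))
    logits = [init] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        gradients = [y - sigmoid(z) for y, z in zip(ys, logits)]
        stump = fit_stump(xs, gradients)
        stumps.append(stump)
        logits = [z + learning_rate * stump(x) for z, x in zip(logits, xs)]

    def predict_proba(x):
        return sigmoid(init + learning_rate * sum(s(x) for s in stumps))

    return predict_proba


# Synthetic binary data: p(Y=1|X=x) increases with x.
random.seed(0)
xs = [random.uniform(-3.0, 3.0) for _ in range(200)]
ys = [1 if random.random() < sigmoid(2.0 * x) else 0 for x in xs]

model = fit_boosted_bernoulli(xs, ys)
print(model(-2.5), model(2.5))  # estimated p(Y=1|X=x) at two points
```

The labels `ys` are only ever used to compute the continuous gradient targets, so every base learner in the ensemble solves a regression problem even though the overall task is classification.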