dmlc / xgboost

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
https://xgboost.readthedocs.io/en/stable/
Apache License 2.0

Base score for custom multi-output objectives is hard-coded at 0.5 #9840

Open david-cortes opened 11 months ago

david-cortes commented 11 months ago

EDIT: sorry, first version of this issue was wrong, I've now edited it.

When using a custom multi-output objective with multi-output trees, the intercept / base score (the starting prediction before the first tree) seems to be 0.5 regardless of what the labels are.

In this example, I am passing a multi-output label in which the second column is the negative of the first one:

import numpy as np, xgboost as xgb
mtcars = np.array([[21,6,160,110,3.9,2.62,16.46,0,1,4,4],
[21,6,160,110,3.9,2.875,17.02,0,1,4,4],
[22.8,4,108,93,3.85,2.32,18.61,1,1,4,1],
[21.4,6,258,110,3.08,3.215,19.44,1,0,3,1],
[18.7,8,360,175,3.15,3.44,17.02,0,0,3,2],
[18.1,6,225,105,2.76,3.46,20.22,1,0,3,1],
[14.3,8,360,245,3.21,3.57,15.84,0,0,3,4],
[24.4,4,146.7,62,3.69,3.19,20,1,0,4,2],
[22.8,4,140.8,95,3.92,3.15,22.9,1,0,4,2],
[19.2,6,167.6,123,3.92,3.44,18.3,1,0,4,4],
[17.8,6,167.6,123,3.92,3.44,18.9,1,0,4,4],
[16.4,8,275.8,180,3.07,4.07,17.4,0,0,3,3],
[17.3,8,275.8,180,3.07,3.73,17.6,0,0,3,3],
[15.2,8,275.8,180,3.07,3.78,18,0,0,3,3],
[10.4,8,472,205,2.93,5.25,17.98,0,0,3,4],
[10.4,8,460,215,3,5.424,17.82,0,0,3,4],
[14.7,8,440,230,3.23,5.345,17.42,0,0,3,4],
[32.4,4,78.7,66,4.08,2.2,19.47,1,1,4,1],
[30.4,4,75.7,52,4.93,1.615,18.52,1,1,4,2],
[33.9,4,71.1,65,4.22,1.835,19.9,1,1,4,1],
[21.5,4,120.1,97,3.7,2.465,20.01,1,0,3,1],
[15.5,8,318,150,2.76,3.52,16.87,0,0,3,2],
[15.2,8,304,150,3.15,3.435,17.3,0,0,3,2],
[13.3,8,350,245,3.73,3.84,15.41,0,0,3,4],
[19.2,8,400,175,3.08,3.845,17.05,0,0,3,2],
[27.3,4,79,66,4.08,1.935,18.9,1,1,4,1],
[26,4,120.3,91,4.43,2.14,16.7,0,1,5,2],
[30.4,4,95.1,113,3.77,1.513,16.9,1,1,5,2],
[15.8,8,351,264,4.22,3.17,14.5,0,1,5,4],
[19.7,6,145,175,3.62,2.77,15.5,0,1,5,6],
[15,8,301,335,3.54,3.57,14.6,0,1,5,8],
[21.4,4,121,109,4.11,2.78,18.6,1,1,4,2]])
y = mtcars[:, 0]
X = mtcars[:, 1:]

def rmse_obj(predt, dtrain):
    # Print the initial predictions and abort on purpose: the point is only
    # to inspect the base score that boosting starts from.
    print(predt[:5])
    raise ValueError()
    # Unreached squared-error objective: gradient and hessian of the loss
    # with respect to the predictions.
    y = dtrain.get_label().reshape(predt.shape)
    grad = (predt - y).reshape(-1)
    hess = np.ones_like(grad)
    return grad, hess

dm = xgb.DMatrix(data=X, label=np.c_[y, -y])
model = xgb.train(
    dtrain=dm,
    params={
        "tree_method": "hist",
        "multi_strategy": "multi_output_tree",
    },
    num_boost_round=3,
    obj=rmse_obj
)
This prints:

[[0.5 0.5]
 [0.5 0.5]
 [0.5 0.5]
 [0.5 0.5]
 [0.5 0.5]]

Both output columns start from the same base score of 0.5, which doesn't look like it would be a better choice than zero, for example.

trivialfis commented 11 months ago

Indeed, when a custom objective is used, the intercept cannot be fitted by XGBoost, since the intercept is fitted according to the objective. For instance, there is a closed-form solution for MAE (the median). Users need to set the base_score parameter manually when a custom objective is used.
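For illustration, a minimal sketch of that workaround (not part of the original thread), reusing dm and rmse_obj from the example above: pass base_score explicitly so boosting starts from a chosen intercept instead of the 0.5 default. Note that base_score is a single scalar, so with multi-output labels the same starting value is assumed to apply to every target.

# A minimal sketch, assuming dm and rmse_obj are defined as above:
# supply the intercept yourself via base_score when using a custom objective.
params = {
    "tree_method": "hist",
    "multi_strategy": "multi_output_tree",
    "base_score": 0.0,  # start boosting from zero rather than the 0.5 default
}
model = xgb.train(params=params, dtrain=dm, num_boost_round=3, obj=rmse_obj)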

david-cortes commented 11 months ago

But in that case, why not set it to zero instead of 0.5? Or why not add a parameter (ideally turned on by default) to try to obtain it through Newton steps if base_score is not supplied by the user?

trivialfis commented 11 months ago

I think the 0.5 choice was due to logistic regression with the sigmoid. It's an old default that nobody touches. I added the Newton step in recent releases, but did not change the default values for cases where the Newton step is not applicable. In summary, "historical reasons".

Your suggestion is interesting; I can look into fitting the intercept for custom objectives with a one-step Newton method in the future.
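To sketch what that could look like (a hypothetical illustration, not the XGBoost implementation): evaluate the custom objective once at a zero prediction and take a single Newton step, -sum(grad) / sum(hess) per target. For squared error this reduces to the per-target mean of the labels.

# Hypothetical sketch of a one-step Newton intercept for the labels above;
# this is not an XGBoost API, just the arithmetic the suggestion describes.
labels = np.c_[y, -y]
predt0 = np.zeros_like(labels)                    # start from a zero prediction
grad = predt0 - labels                            # squared-error gradient at zero
hess = np.ones_like(grad)                         # squared-error hessian
intercept = -grad.sum(axis=0) / hess.sum(axis=0)
print(intercept)                                  # per-target label means, roughly [20.09, -20.09]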

david-cortes commented 11 months ago

As a quick workaround, wouldn't it be better to at least leave it at zero?

In the case of logistic regression with the sigmoid, a raw score of zero is what would output a probability of 0.5, as opposed to a raw score of 0.5, which corresponds to a probability of about 0.62.

trivialfis commented 11 months ago

The inverse of the sigmoid is logit, which turns 0.5 into 0.0. That's my guess anyway.
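A quick numeric check of the two statements above (a throwaway sketch, not part of the original exchange):

import numpy as np
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
logit = lambda p: np.log(p / (1.0 - p))
print(sigmoid(0.0))  # 0.5   -> a raw score of zero gives probability 0.5
print(sigmoid(0.5))  # ~0.62 -> a raw score of 0.5 gives probability ~0.62
print(logit(0.5))    # 0.0   -> a base_score of 0.5 maps back to a raw score of zero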

> As a quick workaround, wouldn't it be better to at least leave it at zero?

I think the default value probably has very little meaning for regression output, where the mean can be anything. I can work on intercept fitting for custom objectives.