microsoft / LightGBM


Refitting quantile regression model does not work when the target scale is different from 1 #5760

Open takeyama0 opened 1 year ago

takeyama0 commented 1 year ago

Description

Refitting a quantile regression model does not work when the target's scale is different from 1. In particular, if the target is sufficiently larger or smaller than the gradient of the pinball loss function, typically y >> 1 or y << 1, the quantile regression refit does not work.

I think this is because the algorithm that determines the output of each leaf differs between normal quantile regression training and refitting. During normal training, the output of a leaf is the quantile of the data contained in that leaf, rather than the result of a gradient-boosting step, as described here. During refit, on the other hand, the leaf values are updated by gradient boosting, maybe here.
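
To make the contrast concrete, here is a rough sketch of the two leaf-output rules (simplified pseudocode reflecting my understanding, not LightGBM's actual implementation; the constant-hessian behaviour of the quantile objective is my assumption):

```python
import numpy as np

# Sketch: how a leaf's output is (roughly) determined in each mode.

def leaf_output_training(residuals, alpha):
    # Normal quantile training: the leaf output is renewed to the
    # alpha-quantile of the residuals of the samples in the leaf,
    # so its magnitude follows the scale of the target.
    return np.quantile(residuals, alpha)

def leaf_output_refit(gradients, hessians):
    # Refit: the leaf output is a single gradient-boosting step.
    # For the pinball loss the gradients lie in (-1, 1) (and, I believe,
    # the hessians are constant), so this value stays bounded by ~1
    # regardless of the target's scale.
    return -np.sum(gradients) / np.sum(hessians)
```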

The gradient of the pinball loss is a constant between -1 and 1, independent of the target's magnitude, so convergence is slow when the target scale is far from 1. Since no new trees are added during refitting, the refitted model cannot converge if the original model does not have a sufficient number of estimators.
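
For reference, for a prediction $\hat{y}$ the pinball loss and its gradient are

$$L_\alpha(y, \hat{y}) = \begin{cases} \alpha \,(y - \hat{y}) & y \ge \hat{y} \\ (1 - \alpha)\,(\hat{y} - y) & y < \hat{y} \end{cases} \qquad \frac{\partial L_\alpha}{\partial \hat{y}} = \begin{cases} -\alpha & y > \hat{y} \\ 1 - \alpha & y < \hat{y} \end{cases}$$

so the magnitude of each refit leaf update is bounded by roughly $\max(\alpha, 1 - \alpha) < 1$, no matter how large $|y|$ is.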

Reproducible example

```python
import numpy as np
import lightgbm as lgb
from matplotlib import pyplot as plt

# setup for input data
X_shape = (10000, 1)
y_scale = 1 # to see an example where refit doesn't work, change this value to 0.01 or 10.

# create input data
X = np.random.random(X_shape)
y_true = np.sin(10 * X[:,0])
y = y_true + 0.5 * np.random.normal(0, 1, X_shape[0])
y = y_scale * y

# train 90% quantile model in the usual way
q_90_params = {
    "objective": "quantile",
    "alpha": 0.9,
    "verbose": -1,
}
lgbm_q_90 = lgb.train(params=q_90_params, train_set=lgb.Dataset(X, y))

# train 50% quantile model in the usual way
q_50_params = {
    "objective": "quantile",
    "alpha": 0.5,
    "verbose": -1,
}
lgbm_q_50 = lgb.train(params=q_50_params, train_set=lgb.Dataset(X, y))

# train 90% quantile model by refitting a pre-trained 50% quantile model
lgbm_q_90_refitted = lgb.train(params=q_50_params, train_set=lgb.Dataset(X, y))
lgbm_q_90_refitted.params['alpha'] = 0.9
# set decay_rate=0 to completely replace the output of each leaf with the new value
lgbm_q_90_refitted = lgbm_q_90_refitted.refit(X, y, decay_rate=0.)

# plot output
fig, ax = plt.subplots()
ax.scatter(X[:,0], y, label='observed y', s=1)
ax.scatter(X[:,0], lgbm_q_90.predict(X), label='90% quantile', color="orange", s=1)
# Shift the 50% quantile value a little so you can see the overlap.
ax.scatter(X[:,0], lgbm_q_50.predict(X) - 0.1 * y_scale , label='50% quantile', color="black", s=1)
ax.scatter(X[:,0], lgbm_q_90_refitted.predict(X), label='refitted 90% quantile', color="red", s=1)
plt.legend(markerscale=5)
plt.show()
```

When the target is scaled by 1, refit works well (set `y_scale` to 1 at the beginning of the code above).

[figure: quantile_refit_scale_1]

When the target is scaled by a factor of 10, refit does not work.

[figure: quantile_refit_scale_10]

Scaling by 0.01 does not work for refit either.

[figure: quantile_refit_scale_0.01]

I defined the prediction-difference ratio $r$ as in the equation below and plotted its dependence on the target scale.

$$r = \frac{y_{refit} - y_{train}}{y_{scale}}$$

$y_{refit}$ and $y_{train}$ are the 90% quantile values predicted by the refitted model and the normally trained model, respectively, and $y_{scale}$ is the scale of the target. As the target scale becomes much larger or smaller than 1, the quantile predictions of the refitted model diverge from those of the normally trained model.

[figure: prediction_difference, the ratio $r$ plotted against the target scale]
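
For completeness, a minimal sketch of how such a scan could be produced (the summary statistic, mean of $|r|$, and all names here are my own illustrative choices):

```python
# Hypothetical sketch: scan the prediction-difference ratio r over several
# target scales, reusing the training/refit procedure from the example above.
import numpy as np
import lightgbm as lgb

def prediction_difference_ratio(y_scale, n=10000, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.random((n, 1))
    y = y_scale * (np.sin(10 * X[:, 0]) + 0.5 * rng.normal(0, 1, n))

    # 90% quantile model trained the usual way
    trained = lgb.train(
        {"objective": "quantile", "alpha": 0.9, "verbose": -1}, lgb.Dataset(X, y)
    )

    # 50% quantile model refitted towards the 90% quantile
    refitted = lgb.train(
        {"objective": "quantile", "alpha": 0.5, "verbose": -1}, lgb.Dataset(X, y)
    )
    refitted.params["alpha"] = 0.9
    refitted = refitted.refit(X, y, decay_rate=0.0)

    r = (refitted.predict(X) - trained.predict(X)) / y_scale
    return np.mean(np.abs(r))

for scale in [0.01, 0.1, 1, 10, 100]:
    print(scale, prediction_difference_ratio(scale))
```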

jameslamb commented 10 months ago

Thank you for the excellent write-up, and sorry it's taken so long to get you a response!

@jmoralez is this something you could look into?