Refitting a quantile regression model does not work when the target scale is different from 1. In particular, when the target is much larger or much smaller in magnitude than the gradient of the pinball loss function, typically y >> 1 or y << 1, the quantile regression refit fails.

I think this is because the algorithm that determines the output of each leaf differs between normal quantile regression training and refitting. During normal training, the output of a leaf is the quantile value of the data contained in that leaf, rather than a gradient boosting step, as described here. In refit, on the other hand, the leaf values are updated by gradient boosting, maybe here.

The gradient of the pinball loss is a constant between -1 and 1 regardless of the target's magnitude, so convergence is slow when the target scale is far from 1. Since no new trees are added during refitting, the refit cannot converge if the original model does not have enough estimators.
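For reference, here is the pinball loss for quantile level $\alpha$ and its gradient with respect to the prediction $\hat{y}$ (the standard textbook form, which I assume matches LightGBM's implementation up to sign conventions):

$$L_\alpha(y, \hat{y}) = \begin{cases} \alpha\,(y - \hat{y}) & \text{if } y \ge \hat{y} \\ (1 - \alpha)\,(\hat{y} - y) & \text{if } y < \hat{y} \end{cases} \qquad \frac{\partial L_\alpha}{\partial \hat{y}} = \begin{cases} -\alpha & \text{if } y \ge \hat{y} \\ 1 - \alpha & \text{if } y < \hat{y} \end{cases}$$

So $\left|\partial L_\alpha / \partial \hat{y}\right| \le \max(\alpha, 1 - \alpha) < 1$ no matter how large or small $y$ is, and each boosting pass during refit can move a leaf value only by roughly the learning rate times this bounded gradient.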
Reproducible example
```python
import numpy as np
import lightgbm as lgb
from matplotlib import pyplot as plt

# set up the input data
X_shape = (10000, 1)
y_scale = 1  # to see an example where refit does not work, change this value to 0.01 or 10.

# create the input data
X = np.random.random(X_shape)
y_true = np.sin(10 * X[:, 0])
y = y_true + 0.5 * np.random.normal(0, 1, X_shape[0])
y = y_scale * y

# train a 90% quantile model in the ordinary way
q_90_params = {
    "objective": "quantile",
    "alpha": 0.9,
    "verbose": -1,
}
lgbm_q_90 = lgb.train(params=q_90_params, train_set=lgb.Dataset(X, y))

# train a 50% quantile model in the ordinary way
q_50_params = {
    "objective": "quantile",
    "alpha": 0.5,
    "verbose": -1,
}
lgbm_q_50 = lgb.train(params=q_50_params, train_set=lgb.Dataset(X, y))

# train a 90% quantile model by refitting a pre-trained 50% quantile model
lgbm_q_90_refitted = lgb.train(params=q_50_params, train_set=lgb.Dataset(X, y))
lgbm_q_90_refitted.params['alpha'] = 0.9
# set decay_rate=0 to completely replace the output of each leaf with the new value
lgbm_q_90_refitted = lgbm_q_90_refitted.refit(X, y, decay_rate=0.)

# plot the output
fig, ax = plt.subplots()
ax.scatter(X[:, 0], y, label='observed y', s=1)
ax.scatter(X[:, 0], lgbm_q_90.predict(X), label='90% quantile', color="orange", s=1)
# shift the 50% quantile down a little so the overlap is visible
ax.scatter(X[:, 0], lgbm_q_50.predict(X) - 0.1 * y_scale, label='50% quantile', color="black", s=1)
ax.scatter(X[:, 0], lgbm_q_90_refitted.predict(X), label='refitted 90% quantile', color="red", s=1)
plt.legend(markerscale=5)
plt.show()
```
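The failure can also be quantified without reading the plot. As a minimal sketch (this check is my addition, not part of the script above), compare the empirical coverage of the two 90% quantile models, i.e. the fraction of observations that fall at or below the prediction; a correct 90% quantile model should give roughly 0.9:

```python
# empirical coverage: fraction of observed y at or below the predicted 90% quantile
coverage_trained = np.mean(y <= lgbm_q_90.predict(X))
coverage_refit = np.mean(y <= lgbm_q_90_refitted.predict(X))
print(f"ordinary 90% model coverage: {coverage_trained:.3f}")
print(f"refitted 90% model coverage: {coverage_refit:.3f}")
```

With y_scale = 1 both numbers should come out close to 0.9; with y_scale = 10 or 0.01 I expect the refitted model's coverage to fall well short of 0.9, consistent with the results described below.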
When the target scale is 1 (y_scale = 1 at the beginning of the code above), refit works well. If we scale the target by a factor of 10, refit does not work, and scaling by 0.01 fails in the same way.
I defined the predicted difference ratio $r$ as in the equation below and plotted its dependence on the target scale:

$$r = \frac{y_{\mathrm{refit}} - y_{\mathrm{train}}}{y_{\mathrm{scale}}}$$

where $y_{\mathrm{refit}}$ and $y_{\mathrm{train}}$ are the 90% quantile values predicted by the refitted model and the ordinarily trained model, and $y_{\mathrm{scale}}$ is the scale of the target. As the target scale becomes larger or smaller than 1, the quantile predictions of the refitted model and the normally trained model diverge.
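For completeness, here is a sketch of how such a scan can be produced (the r_for_scale helper and the mean aggregation over training points are mine, introduced for illustration; the original plot may aggregate differently):

```python
import numpy as np
import lightgbm as lgb

def r_for_scale(y_scale, n=10000, seed=0):
    # rebuild the same kind of data as in the reproducible example above
    rng = np.random.default_rng(seed)
    X = rng.random((n, 1))
    y = y_scale * (np.sin(10 * X[:, 0]) + 0.5 * rng.normal(0, 1, n))
    base = {"objective": "quantile", "verbose": -1}
    m_train = lgb.train({**base, "alpha": 0.9}, lgb.Dataset(X, y))
    m_refit = lgb.train({**base, "alpha": 0.5}, lgb.Dataset(X, y))
    m_refit.params["alpha"] = 0.9
    m_refit = m_refit.refit(X, y, decay_rate=0.)
    # mean pointwise difference between the two 90% quantile predictions,
    # normalized by the target scale
    return np.mean(m_refit.predict(X) - m_train.predict(X)) / y_scale

for s in [0.01, 0.1, 1, 10, 100]:
    print(f"y_scale={s}: r={r_for_scale(s):+.4f}")
```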