Closed arumds closed 1 month ago
Hey. Thanks for using LightGBM. Can you try setting the condition to greater equal? i.e.
grad = np.where(residual >= 0, 0.5, -0.5)
@jmoralez tried setting to grad = np.where(residual >= 0, 0.5, -0.5)
params = {
    "objective": median_loss,
}
cv_result = lgb.cv(params, dtrain, nfold=n_folds, metrics='rmse', stratified=False, return_cvbooster=True)
Log:
[LightGBM] [Warning] Using self-defined objective function
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] Stopped training because there are no more leaves that meet the split requirements
[LightGBM] [Warning] Using self-defined objective function
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] Stopped training because there are no more leaves that meet the split requirements
[LightGBM] [Warning] Using self-defined objective function
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] Stopped training because there are no more leaves that meet the split requirements
[LightGBM] [Warning] Using self-defined objective function
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] Stopped training because there are no more leaves that meet the split requirements
[LightGBM] [Warning] Using self-defined objective function
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] Stopped training because there are no more leaves that meet the split requirements
[1] cv_agg's train rmse: 4.66734 + 0.00107263 cv_agg's valid rmse: 4.66734 + 0.00428721
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
[LightGBM] [Warning] Stopped training because there are no more leaves that meet the split requirements
[LightGBM] [Warning] No further splits with positive gain, best gain: -inf
When I debugged the median_loss objective during the execution of lgb.cv(), the preds were all zero, as seen in the screenshot:
With objective='regression' the model trains normally. Logs are below:
[1] cv_agg's train rmse: 0.730986 + 0.000761274 cv_agg's valid rmse: 0.730999 + 0.00305853
[2] cv_agg's train rmse: 0.724106 + 0.000747364 cv_agg's valid rmse: 0.724126 + 0.00305247
[3] cv_agg's train rmse: 0.717755 + 0.000743182 cv_agg's valid rmse: 0.717786 + 0.00304095
[4] cv_agg's train rmse: 0.711056 + 0.000728518 cv_agg's valid rmse: 0.711092 + 0.00303802
[5] cv_agg's train rmse: 0.704382 + 0.000716823 cv_agg's valid rmse: 0.704426 + 0.00302899
[6] cv_agg's train rmse: 0.69778 + 0.00070809 cv_agg's valid rmse: 0.697832 + 0.00301913
[7] cv_agg's train rmse: 0.691297 + 0.000700247 cv_agg's valid rmse: 0.691353 + 0.00301123
[8] cv_agg's train rmse: 0.685269 + 0.000683244 cv_agg's valid rmse: 0.685337 + 0.00301251
[9] cv_agg's train rmse: 0.678915 + 0.000665435 cv_agg's valid rmse: 0.678987 + 0.00301451
[10] cv_agg's train rmse: 0.672621 + 0.000661577 cv_agg's valid rmse: 0.672699 + 0.00300223
[11] cv_agg's train rmse: 0.666394 + 0.000655792 cv_agg's valid rmse: 0.666477 + 0.00299132
When using a custom objective, LightGBM sets the init score to 0, and if it doesn't find a gain with any split you may be left with a single tree that has only the root. You can verify this with the trees_to_dataframe method.
If you're able to provide a reproducible example we can assist further. The following seems to train normally:
import lightgbm as lgb
import numpy as np
from sklearn.datasets import make_regression
def median_loss(preds, train_data: lgb.Dataset):
    y_true = train_data.get_label()
    residual = preds - y_true
    grad = np.where(residual >= 0, 0.5, -0.5)
    hess = np.ones_like(grad)  # Hessian is constant for median pinball loss
    return grad, hess
X, y = make_regression(n_samples=1000, n_features=2)
dtrain = lgb.Dataset(X, y)
params={"objective": median_loss, 'num_leaves': 32, 'verbosity': -1, 'metrics': 'l2'}
cv_hist = lgb.cv(
    params,
    dtrain,
    num_boost_round=10,
    nfold=2,
    stratified=False,
    callbacks=[lgb.log_evaluation(1)],
)
# [1] cv_agg's valid l2: 15698.8 + 269.489
# [2] cv_agg's valid l2: 15689.7 + 269.239
# [3] cv_agg's valid l2: 15680.5 + 268.99
# [4] cv_agg's valid l2: 15671.4 + 268.741
# [5] cv_agg's valid l2: 15662.2 + 268.491
# [6] cv_agg's valid l2: 15653.1 + 268.242
# [7] cv_agg's valid l2: 15644 + 267.993
# [8] cv_agg's valid l2: 15634.8 + 267.744
# [9] cv_agg's valid l2: 15625.7 + 267.495
# [10] cv_agg's valid l2: 15616.6 + 267.246
@jmoralez Attached is a test dtrain binary file which can be used to reproduce as below:
dataset_from_file = lgb.Dataset(data="test.bin")
params={"objective": median_loss, 'num_leaves': 32, 'verbosity': -1, 'metrics': 'l2'}
cv_hist = lgb.cv(
    params,
    dataset_from_file,
    num_boost_round=10,
    nfold=2,
    stratified=False,
    callbacks=[lgb.log_evaluation(1)],
    seed=0,
    metrics='rmse',
    eval_train_metric=True,
    return_cvbooster=True,
)
Unzip the file to test.bin
Did you inspect the produced trees?
Do you mean getting the model from lgb.train after lgb.cv and inspecting the trees? If so, yes, there seems to be only the root.
The hyperparameters returned by lgb.cv() with BayesianOptimization are:
`{'num_iterations': 500, 'early_stopping_rounds': 50, 'bagging_freq': 1, 'learning_rate': 0.01, 'verbosity': -1, 'monotone_constraints': [0, 0, 0, -1, 0, 1], 'objective': <function median_loss at 0x3126261f0>, 'bagging_fraction': 0.8646440511781974, 'feature_fraction': 0.9145568099117258, 'lambda_l1': 0.006027633760716439, 'lambda_l2': 0.005448831829968969, 'max_depth': 14, 'min_child_weight': 0.6394705825246829, 'min_data_in_leaf': 16, 'min_gain_to_split': 0.045670920031283195, 'num_leaves': 292}`
The model is trained with these hyper params and yields:
lgb.Booster.trees_to_dataframe(model)
Out[5]:
tree_index node_depth node_index left_child right_child parent_index \
0 0 1 0-L0 None None None
split_feature split_gain threshold decision_type missing_direction \
0 None None None None None
missing_type value weight count
0 None 0 None None
Does this indicate that the median_loss objective is not good for the dataset?
That means LightGBM isn't able to find a split that satisfies the constraints you've set (min_gain_to_split, min_data_in_leaf, min_child_weight, etc).
This doesn't seem to be an issue within LightGBM or your custom loss, I'm pretty sure you'd get the same result if you used the built-in loss (single tree with only the root which predicts the init score).
If you have very few samples you could try getting more data or reducing the constraints (in case 16 is your minimum min_data_in_leaf, for example).
@jmoralez The hyper parameter boundaries for tuning are shown below:
hyperparam_boundaries = {'num_leaves': (100, 300),
'max_depth': (10, 20),
'feature_fraction': (0.7, 1),
'bagging_fraction': (0.7, 1),
'min_data_in_leaf': (10, 25),
'min_gain_to_split': (0.01,0.05),
'lambda_l1': (0, 0.01),
'lambda_l2': (0, 0.01)
}
With the built-in regression objective, Bayesian hyperparameter tuning with lgb.cv() cross-validation gives these best hyperparameters:
{'num_iterations': 500, 'early_stopping_rounds': 50, 'bagging_freq': 1, 'learning_rate': 0.01, 'verbosity': -1, 'monotone_constraints': [0, 0, 0, -1, 0, 1], 'objective': 'regression', 'bagging_fraction': 0.8150324556477333, 'feature_fraction': 0.9375175114247993, 'lambda_l1': 0.005288949197529045, 'lambda_l2': 0.0056804456109393235, 'max_depth': 19, 'min_child_weight': 0.07041859401008829, 'min_data_in_leaf': 11, 'min_gain_to_split': 0.010808735897613029, 'num_leaves': 266}
And there are >1 trees
lgb.Booster.trees_to_dataframe(model)
Out[2]:
tree_index node_depth node_index ... value weight count
0 0 1 0-S0 ... 4.607710 0.0 66367
1 0 2 0-S2 ... 4.615160 29156.0 29156
2 0 3 0-S7 ... 4.616940 17398.0 17398
3 0 4 0-S18 ... 4.618880 2726.0 2726
4 0 5 0-S53 ... 4.621150 455.0 455
... ... ... ... ... ... ... ...
265495 499 10 499-L241 ... -0.000076 20.0 20
265496 499 10 499-L256 ... 0.000423 11.0 11
265497 499 7 499-S254 ... -0.000418 25.0 25
265498 499 8 499-L38 ... -0.000174 12.0 12
265499 499 8 499-L255 ... -0.000677 13.0 13
The issue occurs only while using custom loss function where it cannot find a split and only predicts the init score 0.
@jmoralez is there anything I am missing here?
What are you returning as the trial's score? As I said, when using a custom objective, LightGBM starts boosting from zero, which might hurt the convergence.
Can you try the approach in https://github.com/microsoft/LightGBM/issues/5114#issuecomment-1084994020 by setting the init score in your dataset (to the target's median in this case), adding it back to your predictions and then computing your metric on that? If you're using a built-in metric it won't work because it won't take into account the init scores.
This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one. Thank you for taking the time to improve LightGBM!
LightGBM version 4.0.0
objective='regression' trains the model to predict the mean of the data, but I am interested in predicting the median of the actual values. In fact, a quantile model with alpha=0.5 would solve the problem. However, the quantile model does not work with the monotone_constraints parameter, which is essential in our case. Therefore, a custom median_loss is passed as the objective in params.
Debugging shows that all predictions during the lgb.cv step are 0, and therefore the gradients are uniform. This might not provide LightGBM with sufficient gradient information to make meaningful splits.
Does anyone have a suggestion on how to train the model effectively with the median_loss custom objective, or with a quantile objective while preserving the monotonic constraint? @jameslamb @vladv14
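To make the "uniform gradients" observation concrete, here is a small pure-Python sketch. It uses the simplified textbook second-order split gain, G²/(H + λ) summed over the children minus the parent, which is an assumption for illustration and not LightGBM's exact implementation (that also handles L1 regularization, min_gain_to_split, and other constraints). The target values are made up. When every prediction is 0 and every target is positive, all gradients equal -0.5, so any split has zero gain; starting from the median gives mixed signs and a positive gain.

```python
import statistics


def median_grad(pred, y):
    # Same sign-based gradient as median_loss in the thread.
    return 0.5 if pred - y >= 0 else -0.5


def split_gain(grads, hessians, go_left, lambda_l2=0.0):
    # Simplified second-order gain (assumption, see lead-in):
    # score(left) + score(right) - score(parent), with score = G^2 / (H + lambda).
    def score(g, h):
        return g * g / (h + lambda_l2)

    g_left = sum(g for g, left in zip(grads, go_left) if left)
    h_left = sum(h for h, left in zip(hessians, go_left) if left)
    g_right = sum(grads) - g_left
    h_right = sum(hessians) - h_left
    return (
        score(g_left, h_left)
        + score(g_right, h_right)
        - score(g_left + g_right, h_left + h_right)
    )


# Hypothetical targets, all well above zero.
targets = [4.2, 4.4, 4.6, 4.8, 5.0, 5.2]
go_left = [True, True, True, False, False, False]  # a split that separates low from high
hess = [1.0] * len(targets)

# Boosting from 0: every residual is negative, so every gradient is -0.5.
grads_from_zero = [median_grad(0.0, y) for y in targets]
gain_from_zero = split_gain(grads_from_zero, hess, go_left)

# Boosting from the median: gradients have mixed signs.
start = statistics.median(targets)
grads_from_median = [median_grad(start, y) for y in targets]
gain_from_median = split_gain(grads_from_median, hess, go_left)

print(gain_from_zero)    # 0.0
print(gain_from_median)  # 1.5
```

A zero gain cannot exceed any positive min_gain_to_split, which is consistent with the root-only tree reported above.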