microsoft / LightGBM

A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
https://lightgbm.readthedocs.io/en/latest/
MIT License

[python-package] early_stopping callback can not work in cross validation #6093

Closed lostpinky closed 10 months ago

lostpinky commented 1 year ago

I trained my model like this:

import lightgbm as lgb
from lightgbm import early_stopping, log_evaluation
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

if params is None:
    params = {
        'objective': 'binary',
        'metric': 'auc',
        'boosting_type': 'gbdt',
        'num_leaves': 31,
        'max_depth': 5,
        'learning_rate': 0.05,
        'feature_fraction': 0.8,
        'bagging_fraction': 0.8,
        'bagging_freq': 5,
        'n_jobs': n_jobs,
        'random_state': random_state,
        'verbose': -1,
    }

kf = StratifiedKFold(n_splits=n_splits, random_state=random_state, shuffle=True)

feas = list(feature_name)

callbacks = [log_evaluation(period=100), early_stopping(stopping_rounds=300)]

# train a model on each fold
for fold_, (train_index, valid_index) in enumerate(kf.split(train_scaler_X, y)):
    print("----------------------------- fold n°{}--------------------------".format(fold_ + 1))
    X_train, X_valid = train_scaler_X.loc[train_index, feature_name], train_scaler_X.loc[valid_index, feature_name]
    y_train, y_valid = y[train_index], y[valid_index]

    # define the training and validation datasets
    train_set = lgb.Dataset(X_train, y_train, feature_name=feas,
                            )
    val_set = lgb.Dataset(X_valid, y_valid, feature_name=feas,
                          reference=train_set
                          )

    lgb_model = lgb.train(params,
                          train_set,
                          valid_sets=[train_set,val_set],
                          num_boost_round=10000,
                          callbacks=callbacks
                          )
    ovr_oof[valid_index] = lgb_model.predict(X_valid, num_iteration=lgb_model.best_iteration)

    print(lgb_model.best_iteration)
    print(ovr_oof[valid_index])
    ovr_oof_auc = roc_auc_score(y_valid, ovr_oof[valid_index])

    print(f"valid_{fold_ + 1}'s auc : {ovr_oof_auc}")

    # ovr_preds_list.append(lgb_model.predict(test_scaler_X, num_iteration=lgb_model.best_iteration))
    # valid_score.append(roc_auc_score(y[valid_index, num_classes[i]], ovr_oof[valid_index, i]))
    ovr_preds += lgb_model.predict(test_scaler_X[feature_name], num_iteration=lgb_model.best_iteration) / n_splits

    importance += lgb_model.feature_importance(importance_type='gain')

But the log is not what I expected:

----------------------------- fold n°1--------------------------
Training until validation scores don't improve for 300 rounds
[1000]  training's auc: 0.966012    valid_1's auc: 0.91076
Early stopping, best iteration is:
[1165]  training's auc: 0.971237    valid_1's auc: 0.911591
[0.84220696 0.7574774  0.22584328 ... 0.60741325 0.38295108 0.03934999]
valid_1's auc : 0.9115913658294483
----------------------------- fold n°2--------------------------
[1000]  training's auc: 0.966005    valid_1's auc: 0.911484
Early stopping, best iteration is:
[1165]  training's auc: 0.971237    valid_1's auc: 0.911591
[0.93620908 0.28785294 0.93086785 ... 0.00793295 0.07906723 0.01210027]
valid_2's auc : 0.9112367655451452
----------------------------- fold n°3--------------------------
[1000]  training's auc: 0.966456    valid_1's auc: 0.91038
Early stopping, best iteration is:
[1165]  training's auc: 0.971237    valid_1's auc: 0.911591
[0.15974756 0.07401902 0.45673908 ... 0.07444335 0.47290451 0.47817144]
valid_3's auc : 0.9103019356098512

The best iteration found by early stopping is exactly the same (1165) across all folds, which should not be possible, and I don't know the reason for it. I then tried another approach to implement early stopping: I used the early_stopping_rounds parameter instead of a callback, and that worked well; the early stopping iteration counts are no longer identical across folds.
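
Roughly, the working alternative looks something like this (a sketch only; it assumes the same params, datasets and log_evaluation callback as above, and relies on early_stopping_rounds being an alias of LightGBM's early_stopping_round core parameter, which can be passed through params):

# Sketch: enable early stopping via the training parameters instead of the
# early_stopping() callback. Assumes params, train_set, val_set and
# log_evaluation are defined as in the snippet above.
params_with_es = dict(params, early_stopping_rounds=300)

lgb_model = lgb.train(params_with_es,
                      train_set,
                      valid_sets=[train_set, val_set],
                      num_boost_round=10000,
                      callbacks=[log_evaluation(period=100)])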

I want to understand what caused this, and whether it's because there's a bug in the early stopping callback function.

More information:

jameslamb commented 1 year ago

Thanks for using LightGBM.

Can you please provide the information that was asked for when you clicked New Issue?

If you can provide that information, someone here will help you when we have time.


Also note... I've reformatted your post to make the difference between code, output from code, and your own words clearer. If you're unsure how I did that, please see https://docs.github.com/en/get-started/writing-on-github/getting-started-with-writing-and-formatting-on-github/basic-writing-and-formatting-syntax.

lostpinky commented 1 year ago

@jameslamb Thank you for your prompt response. I have provided all the requested information and look forward to your further response.

jmoralez commented 11 months ago

I believe the reason is that all of the models are sharing the same callback objects, which get modified in place. You should give each model its own callbacks, i.e. move this line:

callbacks = [log_evaluation(period=100), early_stopping(stopping_rounds=300)]

inside the loop:

    lgb_model = lgb.train(params,
                          train_set,
                          valid_sets=[train_set,val_set],
                          num_boost_round=10000,
                          callbacks=[log_evaluation(period=100), early_stopping(stopping_rounds=300)]
                          )
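
To see why sharing the object matters, here is a toy illustration (not LightGBM's actual implementation) of a stateful early-stopping callback: its best score and best iteration survive between training runs, so a second run is compared against the first run's best instead of starting fresh, and it ends up reporting the first run's best iteration.

# Toy illustration only -- not LightGBM's code. A callback that remembers the
# best score it has seen; reusing one instance across folds leaks that state.
class ToyEarlyStopping:
    def __init__(self, stopping_rounds):
        self.stopping_rounds = stopping_rounds
        self.best_score = float("-inf")  # persists across training runs
        self.best_iter = 0

    def __call__(self, iteration, score):
        if score > self.best_score:
            self.best_score, self.best_iter = score, iteration
        # ask to stop once there has been no improvement for stopping_rounds
        return iteration - self.best_iter >= self.stopping_rounds

shared = ToyEarlyStopping(stopping_rounds=2)

# "Fold 1": the score peaks at iteration 3.
for i, s in enumerate([0.80, 0.85, 0.90, 0.91, 0.905, 0.908]):
    if shared(i, s):
        break
print(shared.best_iter, shared.best_score)  # 3 0.91

# "Fold 2" reuses the same object. Its scores never beat fold 1's 0.91, so it
# stops early and reports fold 1's best iteration and score instead of its own.
for i, s in enumerate([0.79, 0.84, 0.88, 0.89, 0.895, 0.90, 0.902]):
    if shared(i, s):
        break
print(shared.best_iter, shared.best_score)  # still 3 0.91

Creating fresh log_evaluation() and early_stopping() objects inside the loop, as in the snippet above, gives each fold its own clean state.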
github-actions[bot] commented 10 months ago

This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one. Thank you for taking the time to improve LightGBM!