Open nitinmnsn opened 1 year ago
Hi @nitinmnsn, thanks for the reproducible example! I was able to take a look at this tonight and I see that it's because the individual boosters are reaching this point https://github.com/microsoft/LightGBM/blob/216eaff723e11a84b27ae4275675a46e8c7326ba/src/boosting/gbdt.cpp#L424-L432
The warnings aren't printed in your example because of the 'verbose': -1 in the dataset params; if you remove it you can see them print non-stop.
Just wanted to share my findings so far in case someone wants to pick it up.
Why does the cvbooster continue in that case? What does it even mean for the cvbooster to continue in that case? I knew about what you have linked, but I thought that since there are no more splits and the individual boosters aren't training, that is where early stopping should hit. If the individual boosters aren't training, then how did the cvbooster reach 2009 iterations?
And then there's also the second anomaly (I think) I have highlighted: training the cv booster with early stopping, which takes it to the 2009th iteration, vs training with 2009 estimators explicitly specified. The individual booster rounds are different, as I've shown.
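One way to picture the first question is a toy model in which the outer cross-validation loop keeps counting rounds after each fold has stopped adding trees. This is only a sketch under that assumption, not LightGBM's implementation; `simulate_cv` and `stall_rounds` are hypothetical names, and the numbers are made up to keep the example small.

```python
def simulate_cv(num_rounds, stall_rounds):
    """Toy model: each fold adds one tree per round until its stall round.

    ASSUMPTION (not taken from LightGBM's source): the outer loop keeps
    running after a fold stalls, so the round counter and the per-fold
    tree counts can drift apart.
    """
    trees_per_fold = [0] * len(stall_rounds)
    rounds_run = 0
    for i in range(num_rounds):
        rounds_run += 1
        for fold, stall in enumerate(stall_rounds):
            if i < stall:
                # The fold can still split, so it adds a tree this round.
                trees_per_fold[fold] += 1
            # Otherwise the fold adds nothing, but the loop moves on.
    return rounds_run, trees_per_fold


# Hypothetical small numbers: 10 outer rounds, folds stall after 4, 6 and 5.
rounds_run, trees_per_fold = simulate_cv(10, [4, 6, 5])
print(rounds_run)      # 10
print(trees_per_fold)  # [4, 6, 5]
```

In this toy model the outer counter ends above every per-fold count, which is the same shape as a best_iteration of 2009 next to per-booster counts around 1000 to 1200. It does not explain why early stopping failed to trigger sooner.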
Description
Running lightgbm.cv with a lightgbm.callback.early_stopping(50, False) callback results in a cvbooster whose best_iteration is 2009, whereas the current_iterations() for the individual boosters in the cvbooster are [1087, 1231, 1191, 1047, 1225]. My understanding is that the best_iteration of cvbooster should be exactly max(current_iterations) - early_stopping_rounds.
Reproducible example
Import dependencies:
Create dummy data
Creating the training dataset
Setting training hyperparameters that would reproduce the behavior
Create an early stopping callback
Run lightgbm.cv
Check the output
output:
Also, if I run lightgbm.cv for exactly 2009 iterations (the best iteration from the cv run with early stopping) without using early stopping, it sometimes gives a different current_iteration for the boosters. In this particular case, if we run
Then check the number of individual current iterations
output:
Environment info
lightgbm - 3.3.5 installed with pip install lightgbm
pd.__version__ - 1.5.2
np.__version__ - 1.23.5
sklearn.__version__ - 1.2.1
Additional Comments
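As a quick arithmetic check of the expectation stated in the description, the snippet below uses only the numbers reported in this issue. The variable names are illustrative, and the formula is the reporter's stated expectation, not a documented LightGBM guarantee.

```python
# Values reported in this issue.
current_iterations = [1087, 1231, 1191, 1047, 1225]
early_stopping_rounds = 50
reported_best_iteration = 2009

# The reporter's expectation: max(current_iterations) - early_stopping_rounds.
expected_best_iteration = max(current_iterations) - early_stopping_rounds
gap = reported_best_iteration - expected_best_iteration

print(expected_best_iteration)  # 1181
print(gap)                      # 828
```

Under that expectation the best iteration would be 1181, so the reported value of 2009 is 828 rounds higher, and it is also higher than the largest per-booster count (1231).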