loft-br / xgboost-survival-embeddings

Improving XGBoost survival analysis with embeddings and debiased estimators
https://loft-br.github.io/xgboost-survival-embeddings/
Apache License 2.0

Using best round when applying early stopping #26

Closed ChristianMichelsen closed 3 years ago

ChristianMichelsen commented 3 years ago

Hi and thanks for a really interesting library!

I have two small questions.

First of all, what was the reasoning behind your choice of XGBoost instead of LightGBM? I am just curious whether XGB has any inherent advantages for survival analysis compared to LGB.

Then on to my main question, related to early stopping. As far as I can see, when using early stopping you are not using the best iteration for predictions, but rather the last iteration. Or, in code: instead of

y_pred = bst.predict(dtrain)

it should be:

y_pred = bst.predict(dtrain, ntree_limit=bst.best_ntree_limit)

Please correct me if I am mistaken since I have not used your library extensively yet :)

Cheers,

GabrielGimenez commented 3 years ago

Hi Christian, thanks for your report! It's indeed a bug, we'll look into it and provide a fix soon.

About the reasoning behind XGBoost: even though LightGBM has the advantage of handling categorical features directly, we went with XGBoost because it has better support for survival problems via the survival:cox and survival:aft objective functions.

ChristianMichelsen commented 3 years ago

Wow, really quick reply, nice! Cool, glad to see that it wasn't just me who couldn't find it and that it'll be fixed soon.

Ah, that makes sense. Thanks!