ray-project / xgboost_ray

Distributed XGBoost on Ray

Early stopping best practice and examples #301

Closed: daviddwlee84 closed this issue 7 months ago

daviddwlee84 commented 7 months ago

Hi, I've been looking through the Ray documentation for a while, but I'm not sure whether XGBoost-Ray supports early stopping based on an evaluation-set metric score (either maximized or minimized), or how to set it up.

Directly passing early-stopping callbacks to XGBoost seems unreasonable in a distributed training scenario without additional coordination, I think.

It seems this is more likely handled by Ray Tune's stopping mechanisms:

- Tune Stopping Mechanisms (tune.stopper) — Ray 2.8.0
- [Train] Support for early stopping · Issue #21848 · ray-project/ray
- How to Define Stopping Criteria for a Ray Tune Experiment — Ray 2.8.0
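For reference, a minimal sketch of what a metric-based Tune stopper looks like, assuming the Ray 2.8 APIs from the docs above; the trainable, metric name, and threshold values here are made up for illustration:

```python
from ray import train, tune
from ray.tune.stopper import TrialPlateauStopper

def train_fn(config):
    # Hypothetical trainable that reports a validation metric each round.
    for i in range(100):
        train.report({"eval-logloss": 1.0 / (i + 1)})

stopper = TrialPlateauStopper(
    metric="eval-logloss",  # stop a trial once this metric plateaus
    std=0.001,              # tolerated standard deviation within the window
    num_results=5,          # number of recent results to compare
    grace_period=5,         # minimum results before stopping is considered
)

tuner = tune.Tuner(train_fn, run_config=train.RunConfig(stop=stopper))
tuner.fit()
```

This stops individual trials based on reported metrics, though, rather than stopping boosting rounds inside a single distributed training run.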

This one seems somewhat outdated: Tuning xgboost with early stopping - Ray Libraries (Data, Train, Tune, Serve) / Ray Tune - Ray

Maybe a brute-force way would be manual evaluation plus incremental training, along the lines of the sketch below?
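Something like this, assuming `xgboost_ray.train` forwards the `xgb_model` keyword to `xgboost.train` on the workers like its other keyword arguments; `dtrain`/`dvalid` are `RayDMatrix` objects built elsewhere, and the patience and tolerance values are made up:

```python
from xgboost_ray import RayParams, train

ray_params = RayParams(num_actors=4, cpus_per_actor=2)
booster = None
best_score, stale_rounds = float("inf"), 0

for step in range(20):  # at most 20 * 10 = 200 boosting rounds
    evals_result = {}
    booster = train(
        {"objective": "binary:logistic", "eval_metric": "logloss"},
        dtrain,                     # training RayDMatrix, built elsewhere
        num_boost_round=10,         # train in small increments
        evals=[(dvalid, "valid")],  # validation RayDMatrix, built elsewhere
        evals_result=evals_result,
        ray_params=ray_params,
        xgb_model=booster,          # continue from the previous booster
    )
    score = evals_result["valid"]["logloss"][-1]
    if score < best_score - 1e-4:   # manual "improvement" check
        best_score, stale_rounds = score, 0
    else:
        stale_rounds += 1
        if stale_rounds >= 3:       # manual early stopping, patience of 3
            break
```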

daviddwlee84 commented 7 months ago

After some experiments, it seems that passing XGBoost's built-in early-stopping parameters does the job. I had imagined that some workers might quit early and cause communication errors (as in https://github.com/microsoft/LightGBM/issues/6197), but within the Ray framework it works.

https://github.com/ray-project/xgboost_ray/blob/9081780c5826194b780fdad4dbe6872470527cab/xgboost_ray/main.py#L738

ray.train.xgboost.XGBoostTrainer — Ray 2.8.0

To be specific, the related arguments are shown in the sketch below:
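A minimal sketch of that approach, assuming `early_stopping_rounds` is forwarded unchanged to `xgboost.train` on each worker (as the linked code suggests); `dtrain`/`dvalid` are `RayDMatrix` objects built elsewhere:

```python
from xgboost_ray import RayParams, train

evals_result = {}
booster = train(
    {"objective": "binary:logistic", "eval_metric": "logloss"},
    dtrain,                      # training RayDMatrix, built elsewhere
    num_boost_round=500,
    evals=[(dvalid, "valid")],   # metric on the last evals entry drives stopping
    evals_result=evals_result,
    early_stopping_rounds=10,    # XGBoost built-in early stopping
    ray_params=RayParams(num_actors=4),
)

# XGBoost sets these on the booster when early stopping triggers.
print(booster.best_iteration, booster.best_score)
```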