microsoft / FLAML

A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
https://microsoft.github.io/FLAML/
MIT License

support xgboost 2.0 #1219

Closed sonichi closed 9 months ago

sonichi commented 9 months ago

Why are these changes needed?

Trying to support xgboost 2.0

Related issue number

close #1217

Checks

thinkall commented 9 months ago

https://github.com/microsoft/FLAML/actions/runs/6180225370/job/16776350218?pr=1219#step:13:944

FAILED test/spark/test_multiclass.py::TestMultiClass::test_sparse_matrix_classification - AttributeError: 'XGBClassifier' object has no attribute 'use_label_encoder'

sonichi commented 9 months ago

That test doesn't use xgboost 2; it uses xgboost 1.7. Could you check why it fails in the Spark test?

thinkall commented 9 months ago

What do you mean by "it uses xgboost 1.7"? This test passes with xgboost 1.7.0. The Spark tests only parallelize the trials with Spark.

sonichi commented 9 months ago

xgboost 1.7 is installed for this test: https://github.com/microsoft/FLAML/actions/runs/6180225370/job/16776350218?pr=1219#step:8:89

levscaut commented 9 months ago

Hello @sonichi, I fixed a small deprecated import error in the notebook. Hope this helps.

sonichi commented 9 months ago

Thanks. The test still fails.

levscaut commented 9 months ago

I'm still working on this test. Based on my local testing, the failure is most likely caused by an xgboost version mismatch between the PySpark driver and executor. I'm trying to figure out why the install xgb<2 step does not take effect on the executor.
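One way to confirm a driver/executor mismatch like this is to ask the Spark tasks which version they import. The following is only a minimal sketch, not code from this PR: the helper names (installed_version, executor_versions, version_mismatches) and the num_tasks default are hypothetical, and it assumes an existing PySpark SparkContext is passed in as sc.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def executor_versions(sc, package: str, num_tasks: int = 8):
    """Collect the distinct versions of `package` seen by Spark tasks.

    `sc` is assumed to be an existing pyspark SparkContext. `num_tasks` is an
    arbitrary choice; tasks are not guaranteed to land on every executor.
    """
    return (
        sc.parallelize(range(num_tasks), num_tasks)
        .map(lambda _: installed_version(package))
        .distinct()
        .collect()
    )


def version_mismatches(driver_version, worker_versions):
    """Return the sorted worker versions that differ from the driver's version."""
    return sorted({str(v) for v in worker_versions if v != driver_version})
```

On the driver this could be used as version_mismatches(installed_version("xgboost"), executor_versions(sc, "xgboost")). An empty list would suggest both sides see the same version, at least for the tasks that ran.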

sonichi commented 9 months ago

If the executor uses a version <1.7, then installing xgb<2 would not change it. "use_label_encoder" is triggered only for versions <1.7.

levscaut commented 9 months ago

I think the executor is using 1.7 while the driver is using 2.0. When repr() is called on the XGBModel, the driver requires the attribute use_label_encoder, but the 2.0 model object does not have it. A simple fix is to remove the install xgb<2 section. That would fix the failure but make flaml incompatible with xgb<2, so I'll try to find another solution.
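For illustration, keeping both version ranges working could look roughly like the sketch below: set use_label_encoder only for older versions and read the attribute defensively. This is not FLAML's actual code. The helper names are hypothetical, and the 1.7 cutoff is taken from the comment above rather than checked against the xgboost release notes.

```python
def parse_major_minor(version: str):
    """Parse a version string such as '1.7.6' or '2.0.0' into (major, minor)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def classifier_kwargs(xgb_version: str, base=None):
    """Return constructor kwargs, adding use_label_encoder only below 1.7.

    The (1, 7) cutoff is an assumption based on this thread.
    """
    kwargs = dict(base or {})
    if parse_major_minor(xgb_version) < (1, 7):
        kwargs["use_label_encoder"] = False
    return kwargs


def safe_attr(obj, name: str, default=None):
    """Read an attribute that may be missing on newer model objects."""
    return getattr(obj, name, default)
```

With this, classifier_kwargs("2.0.0") leaves the argument out entirely, and safe_attr(model, "use_label_encoder") returns None instead of raising AttributeError. It would not help if the object is created under one version and inspected under another, so the driver and executor still need matching installs.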

thinkall commented 9 months ago

Very weird, it passed when running only test/spark. https://github.com/microsoft/FLAML/actions/runs/6258815880/job/16993596238?pr=1219#step:13:24