Open dannycg1996 opened 5 days ago
Hi, I will look into this when I can, but check the discussion in #1275 as well; it seems they have come across the same issue. I will try to work out what the cause is :) If anyone else can contribute or help out, please do. Thanks!
Hi all,
The n_estimators value on the best model (automl.model) provided by FLAML does not seem to be set correctly for CatBoostClassifiers. Example code here:
The print statement logs the following for me: {'early_stopping_rounds': 10, 'learning_rate': 0.09999999999999996, 'n_estimators': 33, 'thread_count': -1, 'verbose': False, 'random_seed': 10242048, 'task': <flaml.automl.task.generic_task.GenericTask object at 0x7f895f2b3830>, '_estimator_type': 'classifier'}
However, if I look into the actual catboost_error.log, I can see that neither of the two estimators attempted had n_estimators = 33. They actually had n_estimators = 35 and n_estimators = 57. Replicating the FLAML folds myself has shown that this n_estimators value should be 35, meaning that the logs are correct and automl.model is incorrect.
Furthermore, if I run
print(automl.model.model.get_all_params())
I get a dictionary which includes iterations=35. The catboost documentation shows that iterations is an alias of n_estimators, and whilst I haven't managed to pin down the exact cause of this issue, I believe it's tied in somewhere here.

In terms of package versions, I'm using FLAML 2.1.2, catboost 1.2.5, scikit-learn 1.5.0 and Python 3.12.0.
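
For anyone wanting to check the same discrepancy on their own run, here is a rough sketch of the comparison described above. It is plain Python and does not call FLAML or catboost. The helper names (`resolve_iterations`, `iterations_mismatch`) are made up for illustration. The alias list is taken from the catboost parameter documentation, and the two dictionaries are trimmed copies of the values quoted in this issue.

```python
# Names the catboost documentation lists as aliases for the same parameter.
ITERATION_ALIASES = ("iterations", "n_estimators", "num_boost_round", "num_trees")


def resolve_iterations(params):
    """Return the iteration count stored under any known alias, or None."""
    found = {name: params[name] for name in ITERATION_ALIASES if name in params}
    if len(set(found.values())) > 1:
        # Two aliases in the same dict disagree with each other.
        raise ValueError(f"conflicting iteration aliases: {found}")
    return next(iter(found.values()), None)


def iterations_mismatch(reported, fitted):
    """Return (reported, fitted) counts if they differ, otherwise None."""
    a, b = resolve_iterations(reported), resolve_iterations(fitted)
    return None if a == b else (a, b)


# Trimmed copies of the values quoted in this issue (hypothetical variable names).
reported = {"early_stopping_rounds": 10, "n_estimators": 33}
fitted = {"iterations": 35}
print(iterations_mismatch(reported, fitted))  # (33, 35)
```

On a real run, `reported` would be the dictionary printed from the FLAML wrapper and `fitted` would be the result of `automl.model.model.get_all_params()`. The point of resolving aliases first is that the two dictionaries use different key names for the same setting.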