Thank you for providing this useful AutoML package.
In several of the main tree-based libraries (e.g. XGBoost 2.0, CatBoost), the regressors support fitting against multiple targets (see here and here). With recent advances such as XGBoost's multi_output_tree strategy, these multi-output regression/classification models tend to be pretty useful.
Would it be possible to modify the AutoML class so that .fit() accepts a 2-D array-like "y" of targets for regression tasks?
I think the implementation would require:
Modifying asserts / code checks to accept a 2-D multi-target "y".
If multi-output input is detected: assert that every model in the model list supports it (e.g. XGBoost and CatBoost work, whereas LightGBM does not yet have built-in support for it outside of wrapping with sklearn's MultiOutputRegressor). Then modify default parameters to accommodate multiple targets as needed, e.g. for CatBoost regression the default objective needs to be "MultiRMSE" instead of "RMSE". These are minor changes to the default parametrization, if any.
Ensuring splitters split "y" correctly when it is 2-D and pass it as-is to the underlying model(s).
For cross-validation scores: defining a default policy, e.g. averaging the per-target scores, to produce the final validation score used for tuning.
Doing this abstracts away the implementation details of single- vs multi-output for users and allows FLAML to "just work" for either.
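To make the steps above a bit more concrete, here is a rough sketch in plain Python of the detection, default-patching and score-averaging pieces. It is not FLAML code: the helper names (as_rows, resolve_params, mean_rmse), the override table and its estimator keys are all hypothetical. The only library facts assumed are that CatBoost has a "MultiRMSE" loss and XGBoost 2.0 has a multi_strategy="multi_output_tree" option. Splitting is left out because index-based row selection works the same for 1-D and 2-D targets.

```python
from math import sqrt

# Hypothetical override table (not part of FLAML). The parameter names exist in
# CatBoost and XGBoost 2.0; the estimator keys are illustrative assumptions.
MULTI_TARGET_OVERRIDES = {
    "catboost": {"loss_function": "MultiRMSE"},
    "xgboost": {"multi_strategy": "multi_output_tree"},
}


def as_rows(y):
    """Return y as a list of rows, so a flat y becomes rows of length 1."""
    return [list(r) if isinstance(r, (list, tuple)) else [r] for r in y]


def resolve_params(estimator, params, y):
    """Check multi-target support and patch defaults when y has >1 column."""
    n_targets = len(as_rows(y)[0])
    if n_targets == 1:
        return dict(params)  # single target: nothing to change
    if estimator not in MULTI_TARGET_OVERRIDES:
        raise ValueError(f"{estimator} has no built-in multi-target support")
    return {**params, **MULTI_TARGET_OVERRIDES[estimator]}


def mean_rmse(y_true, y_pred):
    """Compute RMSE for each target column, then average across targets."""
    true_rows, pred_rows = as_rows(y_true), as_rows(y_pred)
    n_targets = len(true_rows[0])
    per_target = []
    for j in range(n_targets):
        sq_err = [(t[j] - p[j]) ** 2 for t, p in zip(true_rows, pred_rows)]
        per_target.append(sqrt(sum(sq_err) / len(sq_err)))
    return sum(per_target) / n_targets
```

With a single target, mean_rmse reduces to the ordinary RMSE, so the same scoring path could serve both cases. Averaging is only one possible default policy; a weighted mean or a worst-target score would fit the same interface.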
Thanks for considering, Azmy