microsoft / FLAML

A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
https://microsoft.github.io/FLAML/

Feature Selection by FLAML? #258

Open knoam opened 3 years ago

knoam commented 3 years ago

Could you also use FLAML to select an optimal subset of features, perhaps using fewer features at first, then increasing, similar to how model complexity increases during training?

qingyun-wu commented 3 years ago

Hi @knoam, thank you for your question; that's an interesting idea. Presumably, you can do so by creating a customized learner:

1. Add the number of features to use as a hyperparameter in the search space.
2. Before the actual training, do feature selection according to the number of features suggested by FLAML, to get the features used for training.

One underlying assumption of this approach is that your features are ordered by importance, or that the order does not matter much, so that a decision can be made based on the number of features alone. Let me know what you think!

Thank you!
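A minimal sketch of what this could look like, based on the custom-learner pattern in FLAML's documentation (`SKLearnEstimator` plus `add_learner`). The class `RFTopK`, the factory `_rf_top_k`, the use of `SelectKBest` to rank features, and the search-space bounds are illustrative assumptions, not a tested recipe:

```python
from flaml import AutoML, tune
from flaml.model import SKLearnEstimator  # in recent FLAML versions: flaml.automl.model
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import Pipeline


def _rf_top_k(n_features=10, **rf_params):
    # Hypothetical factory: rank features with a univariate score, keep the
    # top k, then fit a random forest on that subset. The univariate ranking
    # supplies the "features ordered by importance" assumption noted above.
    return Pipeline([
        ("select", SelectKBest(f_classif, k=n_features)),
        ("rf", RandomForestClassifier(**rf_params)),
    ])


class RFTopK(SKLearnEstimator):
    """Illustrative learner whose search space includes the number of features."""

    def __init__(self, task="binary", **config):
        super().__init__(task, **config)
        # estimator_class can be any callable returning a fit/predict object
        self.estimator_class = _rf_top_k

    @classmethod
    def search_space(cls, data_size, task):
        # Assumption: data_size is the shape of the training data, so
        # data_size[1] is the number of columns, as with the built-in learners.
        return {
            "n_features": {
                "domain": tune.randint(lower=1, upper=int(data_size[1]) + 1),
                "init_value": int(data_size[1]),
                "low_cost_init_value": 1,  # start cheap: few features first
            },
            "n_estimators": {
                "domain": tune.lograndint(lower=4, upper=512),
                "init_value": 4,
                "low_cost_init_value": 4,
            },
        }


automl = AutoML()
automl.add_learner(learner_name="rf_top_k", learner_class=RFTopK)
# automl.fit(X_train, y_train, task="classification",
#            estimator_list=["rf_top_k"], time_budget=60)
```

The `low_cost_init_value` of 1 is what would let the search start with few features and grow the number over time, mirroring how FLAML grows other complexity-like hyperparameters.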

jw00000 commented 3 years ago

From what I gleaned from playing with autosklearn, their approach is to build the search space around finding the best 'pipeline', where a pipeline includes some preprocessing steps as well as the estimator. So the search space includes hyperparameters that choose which preprocessing components to use and with what hyperparameters. Among the preprocessing choices are many sklearn transformers, including feature selection transformers. As a result, the search space is rather large, I think. I'm curious what you think about this approach. Would it make the search space too large to be practical? Do you think it would improve the quality of the models?

qingyun-wu commented 3 years ago

Hi @jw00000, thank you for sharing your experience with autosklearn and your suggestion. Including the preprocessing component in the search space (as a hyperparameter with categorical choices) is a very reasonable approach, especially when the number of preprocessing choices is not that large; e.g., it should still be practical when the number is less than 5. Regarding the impact on model quality: if the time/resource budget is abundant, the model quality presumably won't become worse, although when the time/resource budget is small, the quality of the resulting model may be degraded. We haven't tried this yet. Do you want to give it a try? We'd like to know how it works if so. Thank you!
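For concreteness, here is one hedged sketch of that approach using `flaml.tune` directly: the preprocessing step becomes a categorical hyperparameter searched jointly with the model's own hyperparameters. The three preprocessing options, the metric name `cv_accuracy`, and the search ranges are illustrative assumptions:

```python
from flaml import tune
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

X, y = load_breast_cancer(return_X_y=True)


def evaluate(config):
    # Build the pipeline the sampled config describes, then score it.
    steps = []
    if config["preprocessor"] == "select_k_best":
        steps.append(("prep", SelectKBest(f_classif, k=config["k"])))
    elif config["preprocessor"] == "pca":
        steps.append(("prep", PCA(n_components=config["k"])))
    # "passthrough" adds no preprocessing step
    steps.append(("rf", RandomForestClassifier(n_estimators=config["n_estimators"])))
    score = cross_val_score(Pipeline(steps), X, y, cv=3).mean()
    return {"cv_accuracy": score}


analysis = tune.run(
    evaluate,
    config={
        "preprocessor": tune.choice(["passthrough", "select_k_best", "pca"]),
        "k": tune.randint(lower=1, upper=X.shape[1] + 1),
        "n_estimators": tune.lograndint(lower=4, upper=256),
    },
    metric="cv_accuracy",
    mode="max",
    num_samples=-1,   # keep sampling until the time budget runs out
    time_budget_s=60,
)
print(analysis.best_config)
```

Note that `k` is sampled even when `preprocessor` is `"passthrough"`, where it has no effect; a flat space keeps the sketch simple at the cost of some redundant sampling.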

sonichi commented 2 years ago

@knoam what metric would you like to optimize when doing feature selection? And under what constraint?