microsoft / nni

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
https://nni.readthedocs.io
MIT License
14k stars, 1.81k forks

Enhancement of GBDTSelector inherited from FeatureSelector #5719

Open linjing-lab opened 10 months ago

linjing-lab commented 10 months ago

What would you like to be added: The FeatureSelector class is written in a preliminary form, as in the following referenced code snippet: https://github.com/microsoft/nni/blob/767ed7f22e1e588ce76cbbecb6c6a4a76a309805/nni/feature_engineering/feature_selector.py#L26-L34

I think GBDTSelector does not need to inherit from FeatureSelector, since it rewrites all of the class methods and does not inherit FeatureSelector's attributes either: https://github.com/microsoft/nni/blob/767ed7f22e1e588ce76cbbecb6c6a4a76a309805/nni/algorithms/feature_engineering/gbdt_selector/gbdt_selector.py#L35

The GBDTSelector class also uses the train_test_split function from scikit-learn. I was wondering whether the validation dataset is needed to improve the fit method (similar to what I implemented for a training loop that evaluates on validation data in order to stop early): https://github.com/microsoft/nni/blob/767ed7f22e1e588ce76cbbecb6c6a4a76a309805/nni/algorithms/feature_engineering/gbdt_selector/gbdt_selector.py#L86-L89
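To make the validation question concrete, here is a minimal sketch of what a held-out split is typically used for in a training loop: picking the round at which to stop. It is not NNI or LightGBM code; the function name `best_round_with_early_stopping` and the `patience` parameter are hypothetical, and the validation losses are assumed to have been computed on the held-out part of the split.

```python
def best_round_with_early_stopping(valid_losses, patience):
    """Return the 1-based round with the lowest validation loss seen
    before training stops.

    Training stops once `patience` consecutive rounds pass without the
    validation loss improving on the best value so far.
    (Hypothetical helper for illustration only.)
    """
    best_loss = float("inf")
    best_round = 0
    for round_idx, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            # New best validation loss: remember the round.
            best_loss, best_round = loss, round_idx
        elif round_idx - best_round >= patience:
            # No improvement for `patience` rounds: stop early.
            break
    return best_round


# Validation loss per boosting round, measured on the held-out split.
losses = [0.9, 0.7, 0.6, 0.65, 0.66, 0.5]
stopped_at = best_round_with_early_stopping(losses, patience=2)
```

With `patience=2` the loop stops before it ever sees the last value, so the held-out data directly determines how many rounds the final model uses. Without a validation split there is nothing to drive this decision.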

Why is this needed: GBDTSelector is the subclass that inherits from FeatureSelector, but the inheritance is not meaningful because the subclass overrides all of the base class's attributes and methods.

Without this feature, how does current nni work: GBDTSelector works in nni through its own fit and get_selected_features methods, not through the methods of the FeatureSelector class.

Components that may involve changes: Either add a call to super(GBDTSelector, self).__init__ at the start of GBDTSelector's initializer, or drop the FeatureSelector class if the inherited attributes are not important, given that the subclass rewrites all of the methods instead of extending them.
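The two options can be sketched roughly as follows. This is an illustration only, not the NNI source: `BaseSelector`, `InheritingSelector` and `StandaloneSelector` are hypothetical stand-ins, the base-class attributes are assumed to resemble those of FeatureSelector, and the "importance" score is a toy column sum rather than anything a GBDT would compute.

```python
class BaseSelector:
    """Hypothetical stand-in for a FeatureSelector-like base class."""

    def __init__(self, **kwargs):
        self.selected_features_ = None
        self.X = None
        self.y = None

    def fit(self, X, y, **kwargs):
        # Base bookkeeping: remember the training data.
        self.X = X
        self.y = y

    def get_selected_features(self):
        return self.selected_features_


def _rank_columns(X):
    """Toy importance: sum of absolute values per column (assumption)."""
    n_cols = len(X[0])
    importance = [sum(abs(row[j]) for row in X) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: importance[j], reverse=True)
    return importance, ranked


class InheritingSelector(BaseSelector):
    """Option 1: keep the base class and actually reuse it."""

    def __init__(self, **kwargs):
        super().__init__(**kwargs)      # shared attributes come from the base
        self.feature_importance = None  # only subclass-specific state here

    def fit(self, X, y, topk=1, **kwargs):
        super().fit(X, y, **kwargs)     # extend the base method, not replace it
        self.feature_importance, ranked = _rank_columns(X)
        self.selected_features_ = ranked[:topk]
        return self
    # get_selected_features is inherited unchanged.


class StandaloneSelector:
    """Option 2: no base class; the selector defines everything itself."""

    def __init__(self):
        self.selected_features_ = None
        self.feature_importance = None

    def fit(self, X, y, topk=1):
        self.feature_importance, ranked = _rank_columns(X)
        self.selected_features_ = ranked[:topk]
        return self

    def get_selected_features(self):
        return self.selected_features_


X = [[1, 10, 0], [2, 20, 0]]
y = [0, 1]
with_base = InheritingSelector().fit(X, y, topk=2).get_selected_features()
without_base = StandaloneSelector().fit(X, y, topk=2).get_selected_features()
```

Both variants select the same columns. The difference is only in where the shared state and bookkeeping live: in option 1 the base class contributes something, while in option 2 the inheritance is removed because nothing would be reused.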

Brief description of your proposal if any: The attributes and methods defined in FeatureSelector may not contribute to the code logic of any method in GBDTSelector.