rasbt / mlxtend

A library of extension and helper modules for Python's data analysis and machine learning libraries.
https://rasbt.github.io/mlxtend/

Add `group_features` to feature selection #956

Closed NimaSarajpoor closed 1 year ago

NimaSarajpoor commented 2 years ago

Describe the workflow you want to enable

The idea was initially discussed in #954: force some features to be considered together throughout the feature selection process. This is useful when a categorical feature is transformed into a considerable number of one-hot-encoded features. One advantage is reduced computing time. (We might first narrow down the search space with grouped features, drop the unselected ones, and then run a second feature selection on the remaining features without forcing them to be grouped.)
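To make the computing-time argument concrete, here is a rough count of how many candidate subsets an exhaustive search would visit with and without grouping. The feature counts (3 numeric columns, one categorical feature expanded into 5 one-hot columns) are made-up numbers for illustration, and `n_nonempty_subsets` is a hypothetical helper, not part of mlxtend.

```python
def n_nonempty_subsets(n_units: int) -> int:
    """Number of non-empty subsets of n_units selectable units."""
    return 2 ** n_units - 1


# Illustrative assumption: 3 numeric columns plus one categorical
# feature that was one-hot encoded into 5 columns.
n_numeric = 3
n_one_hot = 5

# Without grouping, every column is its own selectable unit.
ungrouped = n_nonempty_subsets(n_numeric + n_one_hot)

# With grouping, the 5 one-hot columns count as a single unit.
grouped = n_nonempty_subsets(n_numeric + 1)

print(ungrouped, grouped)  # 255 15
```

Each candidate subset means at least one cross-validated model fit, so the number of fits shrinks by the same factor.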

Describe your proposed solution

As suggested by @rasbt, we can do something similar to https://github.com/rasbt/mlxtend/blob/master/mlxtend/evaluate/feature_importance.py#L116

The idea is to add a new parameter that indicates which features should be grouped together.
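A minimal sketch of how such a parameter could drive the candidate enumeration, assuming the groups are passed as a list of lists of column indices. The function name `iter_grouped_subsets` and its arguments are hypothetical and only illustrate the idea; they are not the implementation in the PR.

```python
from itertools import chain, combinations


def iter_grouped_subsets(feature_groups, min_groups=1, max_groups=None):
    """Yield candidate feature subsets as sorted tuples of column indices.

    Each inner list in feature_groups is treated as one indivisible unit:
    its columns are always selected or dropped together.
    """
    if max_groups is None:
        max_groups = len(feature_groups)
    for k in range(min_groups, max_groups + 1):
        for combo in combinations(feature_groups, k):
            # Flatten the chosen groups into one tuple of column indices.
            yield tuple(sorted(chain.from_iterable(combo)))


# Columns 2, 3 and 4 are assumed to be the one-hot columns of a
# single categorical feature; columns 0 and 1 stand alone.
feature_groups = [[0], [1], [2, 3, 4]]
subsets = list(iter_grouped_subsets(feature_groups))

for subset in subsets:
    print(subset)
```

With three groups the search visits 7 candidates instead of the 31 non-empty subsets of 5 individual columns, and no candidate ever contains only part of the one-hot group.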

Describe alternatives you've considered, if relevant

N/A

Additional context

I have already implemented this feature for mlxtend/feature_selection/exhaustive_feature_selector.py and am going to submit a PR. I need some help checking whether the choices I made are correct and working out how to test the new feature (I should take a look at the existing test functions).