BradKML opened this issue 2 years ago
For some descriptions:

- cross-validation with cv=4
- a RandomForest model to make the decision
- LGBMRegressor, an internal function in LightGBM, simple enough
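As a rough illustration of the kind of selector those fragments describe, here is a minimal sketch using scikit-learn only, assuming the `cv=4` fragment refers to 4-fold cross-validated recursive feature elimination and the RandomForest is the estimator whose importances drive the keep/drop decision. The synthetic data and variable names are my own, not from the original comment.

```python
# Sketch: cross-validated feature elimination (cv=4) where a RandomForest's
# importances decide which features to drop at each step.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV

X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)

selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=100, random_state=0),
    step=1,  # drop one feature per iteration
    cv=4,    # the cv=4 mentioned above
)
selector.fit(X, y)

print("features kept:", selector.support_.sum())
print("ranking:", selector.ranking_)
```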
paulbkoch commented:

Hi @BrandonKMLee -- Thanks for putting together this list and the descriptions. We'd be open to PRs that implement these alternative algorithms. Our core team is pretty focused on improving EBMs, so we don't have a lot of bandwidth to work on more tangential improvements.
BradKML replied:

@paulbkoch noted with thanks regarding the priority. I also remember how booster-based feature selection was being heavily focused on by everyone:

- https://github.com/scikit-learn-contrib/boruta_py
- https://github.com/Ekeany/Boruta-Shap
- https://github.com/chasedehan/BoostARoota

Also some other small finds regarding MRMR (mutual information; not sure if it overlaps with the other methods here):

- https://github.com/AutoViML/featurewiz
- https://github.com/smazzanti/mrmr
- https://github.com/danielhomola/mifs
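For reference, a hedged sketch of how boruta_py (the first library above) is typically driven, per its README: a RandomForest scores real features against shuffled "shadow" copies and keeps only those that consistently beat the shadows. The synthetic data here is illustrative.

```python
# boruta_py sketch: shadow-feature selection driven by a RandomForest.
import numpy as np
from boruta import BorutaPy
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=15, n_informative=4, random_state=0)

rf = RandomForestClassifier(n_jobs=-1, max_depth=5)
selector = BorutaPy(rf, n_estimators="auto", random_state=1)
selector.fit(X, y)  # BorutaPy expects numpy arrays, not DataFrames

print("confirmed features:", np.where(selector.support_)[0])
print("tentative features:", np.where(selector.support_weak_)[0])
X_filtered = selector.transform(X)  # keep only confirmed features
```

The mrmr package above is even more compact: its README shows a single call of the form `mrmr_classif(X=df, y=y, K=10)` returning the selected column names.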
P.S. There are other super-repos for feature importance:

- https://github.com/JingweiToo/Wrapper-Feature-Selection-Toolbox
- https://github.com/jundongl/scikit-feature
TL;DR: the original list of yet-to-be-implemented FE algorithms is in https://github.com/parrt/random-forest-importances/issues/54
Seeing https://github.com/interpretml/interpret/issues/364 and https://github.com/interpretml/interpret/issues/218, I notice that some feature importance algorithms are not on the list, particularly LOFO, Morris, and "Unbiased" feature importance. Might wanna check those out?
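LOFO (leave-one-feature-out) in particular is simple to prototype without any extra dependency. Here is a rough sketch of the idea only, not any specific library's implementation; the model, data, and cv=4 choice are illustrative assumptions.

```python
# Rough LOFO sketch: score the model with all features, then re-score with
# each feature left out; the drop in CV score is that feature's importance.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0)

baseline = cross_val_score(model, X, y, cv=4).mean()
for i in range(X.shape[1]):
    X_drop = np.delete(X, i, axis=1)  # leave feature i out
    score = cross_val_score(model, X_drop, y, cv=4).mean()
    print(f"feature {i}: importance = {baseline - score:+.4f}")
```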
Bonus: this visualization notebook exists: https://github.com/shionhonda/feature-importance
Currently these are not in the README. Unsure if they have an alternative name:

- LIME, being similar to SHAP (https://github.com/marcotcr/lime). How are they different (missing data test vs. prioritization, significant correlation)?
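To make the LIME-vs-SHAP question concrete, here is a small side-by-side sketch using the standard lime and shap packages. In spirit, LIME perturbs one instance and fits a local linear surrogate to the black-box model, while (Tree)SHAP computes Shapley-value attributions from the tree structure itself. The model and synthetic data are illustrative assumptions.

```python
# Side-by-side sketch: both explain the same prediction, via different routes.
import shap
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# LIME: fit a local surrogate around instance X[0]
lime_explainer = LimeTabularExplainer(X, mode="classification")
lime_exp = lime_explainer.explain_instance(X[0], model.predict_proba, num_features=5)
print(lime_exp.as_list())

# SHAP: Shapley-value attributions for the same instance
shap_explainer = shap.TreeExplainer(model)
shap_values = shap_explainer.shap_values(X[:1])
print(shap_values)
```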