scikit-learn-contrib / boruta_py

Python implementations of the Boruta all-relevant feature selection method.
BSD 3-Clause "New" or "Revised" License

Implements sample_weight and optional permutation and SHAP importance, categorical features, boxplot #100

Open ThomasBury opened 2 years ago

ThomasBury commented 2 years ago

Hi,

It took me a while, but I finally found the time to continue the discussion from https://github.com/scikit-learn-contrib/boruta_py/pull/77

Meaning:

danielhomola commented 2 years ago

Hi Thomas,

Thanks for this! I will try to find time in the next few weeks to go through it (it's quite a lot). To start with, however, can we make sure that no .idea and .ipython-checkpoints files are committed? Thanks!

ThomasBury commented 2 years ago

> Hi Thomas,
>
> Thanks for this! Will try to find time in the next few weeks to go through it (it's quite a lot). To start with however, can we make sure that no .idea and .ipython-checkpoints file are committed? Thanks!

Sorry, those were leftovers from the previous ignore file; the .idea and checkpoint files have been removed. I hope the notebook is helpful. Do not hesitate to ask if you have any questions or remarks.

Thanks

erikvdp commented 2 years ago

This seems like a really cool PR! Is there any chance that it will get merged soon?

ThomasBury commented 2 years ago

> This seems like a really cool PR! Is there any chance that it will get merged soon?

Thanks @erikvdp. Meanwhile, you might have a look at https://github.com/ThomasBury/arfs, which implements those features and more (although I still think it would be best to integrate the Boruta-related changes into the official boruta_py ^^).

MauritsDescamps commented 1 year ago

Any chance this will be merged? I would really like to try out Boruta with SHAP feature importance.

ThomasBury commented 1 year ago

> Any chance this will be merged? Would really like to try out Boruta with Shap feature importance

Hi @MauritsDescamps, I built the ARFS package to provide those features for Boruta (and much more). In the ARFS package, you'll find three different methods for performing all-relevant feature selection. I called the evolution of Boruta "Leshy"; it provides the features of this PR. There are notebooks that explain, step by step, how to use it and what the differences are.

You can try it by simply running pip install -U arfs; there is a brand new release (version 1.0.2).