scikit-learn-contrib / boruta_py

Python implementations of the Boruta all-relevant feature selection method.
BSD 3-Clause "New" or "Revised" License

What to do when Boruta seems to be rejecting more features than it should? #55

Closed felipheggaliza closed 5 years ago

felipheggaliza commented 5 years ago

Hi,

I am experimenting with Boruta, and sometimes the algorithm rejects more features than it should. Which parameters can I tune, either in Boruta itself or in the Random Forest, to get it working properly?

danielhomola commented 5 years ago

> more features than it should

How do you know what it should do? Seems like you have strong priors when looking at your data.

> to have it working properly?

Boruta is a published feature selection (FS) method, used by thousands of people in different fields and competitions. It is working properly, as in, it's doing what it says on the tin (have you read the paper?).

Nothing in your comment is quantitative, benchmarked, or scientifically helpful in any sense, so I recommend you read the paper, or at least the README (then you'd know that adjusting the p-value cut-off helps).
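To see why the p-value cut-off matters, here is a minimal sketch of the kind of decision rule Boruta applies to each feature. It is not the BorutaPy source: the function names `binom_cdf` and `decide` are made up for illustration, and it deliberately leaves out the multiple-testing correction that the real implementation applies. The assumption is that a feature scores a "hit" in each iteration where it beats the shadow features, and that under the null hypothesis the hit count follows a Binomial(n, 0.5) distribution.

```python
from math import comb


def binom_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p); returns 0.0 when k < 0."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def decide(hits, n_iter, alpha=0.05):
    """Hypothetical, simplified per-feature decision (no p-value correction)."""
    # Probability of seeing this few hits or fewer by chance.
    p_reject = binom_cdf(hits, n_iter)
    # Probability of seeing this many hits or more by chance.
    p_accept = 1.0 - binom_cdf(hits - 1, n_iter)
    if p_accept < alpha:
        return "confirmed"
    if p_reject < alpha:
        return "rejected"
    return "tentative"


# Same feature, same hit count, two different cut-offs.
print(decide(5, 20, alpha=0.05))
print(decide(5, 20, alpha=0.01))
```

With 5 hits in 20 iterations the one-sided p-value is roughly 0.02, so the feature is rejected at a cut-off of 0.05 but stays tentative at 0.01. A stricter cut-off therefore rejects fewer features. In BorutaPy the corresponding knob should be the `alpha` argument, and lowering `perc` (the percentile of shadow importances used as the threshold) makes hits easier to score; check the README for the exact defaults.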

Closing the thread.