Can I somehow speed up the BorutaPy process?

VEZcoding commented 1 year ago

Hello, is there any way to parallelize the iterations? Or some parameter like n_jobs or something? It's super slow with 3k features.

Wuuzzaa commented 1 year ago

Hi, it is not possible to parallelize the iterations because each iteration needs the results of the previous one. You can provide an estimator with n_jobs=-1 to Boruta if you want, so that the estimator itself is fitted in parallel. To speed things up further, you can try early_stopping=True or reduce n_estimators. It may also be a good idea to prefilter your features first with simpler checks, such as dropping duplicate or constant features.
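A minimal sketch of the suggestions above, not a definitive setup. The helper names `drop_constant_and_duplicate_columns` and `make_selector` are hypothetical, and `max_depth=5` and `n_estimators=100` are illustrative values rather than recommendations. The prefilter assumes a purely numeric array without NaNs, and `early_stopping` is only available if your installed boruta version supports it.

```python
import numpy as np


def drop_constant_and_duplicate_columns(X):
    """Return (X_reduced, kept_idx) without constant or exactly duplicated columns.

    Hypothetical helper; assumes X is a numeric 2D array without NaNs.
    """
    X = np.asarray(X)
    # A column is constant when no row differs from the first row.
    non_constant = np.flatnonzero((X != X[0]).any(axis=0))
    # np.unique over columns returns the first index of each distinct column.
    _, first_idx = np.unique(X[:, non_constant], axis=1, return_index=True)
    # Sort so the surviving columns keep their original order.
    kept_idx = non_constant[np.sort(first_idx)]
    return X[:, kept_idx], kept_idx


def make_selector(random_state=0):
    """Build a BorutaPy selector using the speed-related settings from the answer."""
    # Imported lazily so the prefilter above works without boruta installed.
    from boruta import BorutaPy
    from sklearn.ensemble import RandomForestClassifier

    # n_jobs=-1 lets the forest use all cores within each Boruta iteration.
    forest = RandomForestClassifier(n_jobs=-1, max_depth=5, random_state=random_state)
    # Fewer trees than usual and early stopping; both values are assumptions.
    return BorutaPy(
        forest,
        n_estimators=100,
        early_stopping=True,
        random_state=random_state,
    )
```

Usage would look roughly like `X_small, kept = drop_constant_and_duplicate_columns(X)` followed by `make_selector().fit(X_small, y)`; `kept` maps the selected columns back to the original feature indices.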

How many samples do you have? If there are a lot, use a subsample of around 10k rows. I am sure it will make no big difference to the selected features whether you run Boruta on 10k or 1 million samples.
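Something like this could do the subsampling before calling Boruta. It is only a sketch: `subsample_rows` is a hypothetical helper, the 10k cap and the seed are illustrative, and it draws rows uniformly at random without any stratification.

```python
import numpy as np


def subsample_rows(X, y, max_rows=10_000, seed=0):
    """Return a random subset of at most max_rows rows of X and y (no replacement)."""
    X, y = np.asarray(X), np.asarray(y)
    if len(X) <= max_rows:
        # Nothing to do for data that is already small enough.
        return X, y
    rng = np.random.default_rng(seed)
    # Draw distinct row indices so X and y stay aligned.
    idx = rng.choice(len(X), size=max_rows, replace=False)
    return X[idx], y[idx]
```

For strongly imbalanced classification targets, a stratified split (for example scikit-learn's `train_test_split` with `stratify=y`) may be a safer way to pick the subsample.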

VEZcoding commented 1 year ago

Thanks for your answer, I will check out this early stopping :)

scikit-learn-contrib / boruta_py #108