Ekeany / Boruta-Shap

A tree-based feature selection tool that combines the Boruta feature selection algorithm with Shapley values.
MIT License
559 stars, 86 forks

[ENH] Randomize train/test split #97

Open MauritsDescamps opened 2 years ago

MauritsDescamps commented 2 years ago

Randomize train/test split

The random_state argument in BorutaShap.fit is set to 0 by default. This means that the train/test split performed in Check_if_chose_train_or_test_and_train_model is the same for every iteration. For the shadow features this doesn't really matter, but for the real features it means that the same subset of the data is used for training at every iteration. Is this by design, or would it be better to perform a new random split at each iteration?
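To make the concern concrete, here is a minimal sketch in plain Python. It is not BorutaShap's actual code: split_indices and training_sets are hypothetical helpers, and the 0.3 test fraction and the "base seed plus iteration number" scheme are assumptions chosen only for illustration. It compares how many distinct training subsets are seen across iterations when the seed is fixed versus when it changes per iteration.

```python
import random


def split_indices(n_samples, test_fraction, seed):
    """Shuffle range(n_samples) with a seeded RNG and split off a test set.

    Hypothetical stand-in for a seeded train/test split.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = int(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


def training_sets(n_iterations, n_samples, base_seed, vary_per_iteration):
    """Collect the set of training indices used at each iteration."""
    sets = []
    for i in range(n_iterations):
        # Assumption: one simple way to vary the split is base_seed + iteration.
        seed = base_seed + i if vary_per_iteration else base_seed
        train, _ = split_indices(n_samples, 0.3, seed)
        sets.append(frozenset(train))
    return sets


fixed = training_sets(5, 100, base_seed=0, vary_per_iteration=False)
varied = training_sets(5, 100, base_seed=0, vary_per_iteration=True)

# Number of distinct training subsets seen over the five iterations.
print("fixed seed:", len(set(fixed)))
print("per-iteration seed:", len(set(varied)))
```

With a fixed seed every iteration trains on the identical subset, while deriving the seed from the iteration number keeps the run reproducible and still exposes the model to different subsets.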