feature-engine / feature_engine

Feature engineering package with sklearn-like functionality
https://feature-engine.trainindata.com/
BSD 3-Clause "New" or "Revised" License

allow sample weight in shuffle features #662

Closed by solegalli 1 year ago

solegalli commented 1 year ago

closes #654

solegalli commented 1 year ago

Hey @markdregan

Here is a PR to expand the fit method of Shuffle features to accommodate sample weights.

The PR trains the model with sample weights. The predictions made after shuffling are scored without weighting. Based on my understanding, this is the correct procedure. Correct me if I am wrong.
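
To make the procedure concrete, here is a minimal sketch using plain scikit-learn rather than the feature_engine transformer itself. The synthetic data, the weights `w`, and the variable names are assumptions for illustration only. The model is fit with `sample_weight`, and the performance drop after shuffling each feature is measured without weights.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic data (assumption): only the first column carries signal.
X = rng.normal(size=(400, 3))
y = (X[:, 0] > 0).astype(int)
w = rng.uniform(0.5, 2.0, size=400)  # hypothetical sample weights

X_tr, X_te, y_tr, y_te, w_tr, _ = train_test_split(X, y, w, random_state=0)

# Sample weights are used only when training the model.
model = LogisticRegression().fit(X_tr, y_tr, sample_weight=w_tr)

# Baseline performance on the test set, computed without weights.
baseline = accuracy_score(y_te, model.predict(X_te))

# Shuffle one feature at a time and record the unweighted performance drop.
drifts = []
for j in range(X_te.shape[1]):
    X_shuffled = X_te.copy()
    X_shuffled[:, j] = rng.permutation(X_shuffled[:, j])
    drifts.append(baseline - accuracy_score(y_te, model.predict(X_shuffled)))
```

In this sketch, shuffling the first column should produce the largest drop, since it is the only informative feature.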

Thank you!

solegalli commented 1 year ago

Looking at the cross_validate logic, the performance on the test set is calculated without sample weights:

https://github.com/scikit-learn/scikit-learn/blob/364c77e047ca08a95862becf40a04fe9d4cd2c98/sklearn/model_selection/_validation.py#L707-L711

Sample weights are used only to train:

https://github.com/scikit-learn/scikit-learn/blob/364c77e047ca08a95862becf40a04fe9d4cd2c98/sklearn/model_selection/_validation.py#L682-L686
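
The same pattern can be written out as a hand-rolled cross-validation loop. This is only a sketch of the behaviour described above, not scikit-learn's actual implementation; the data, the weights `w`, and the choice of estimator are assumptions. The weights are sliced to the training fold and passed to `fit`, while each test fold is scored without them.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)

# Synthetic data and weights (assumptions for illustration).
X = rng.normal(size=(300, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
w = rng.uniform(0.5, 2.0, size=300)  # hypothetical sample weights

scores = []
cv = KFold(n_splits=3, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(X):
    model = LogisticRegression()
    # Weights are sliced to the training fold and used only in fit.
    model.fit(X[train_idx], y[train_idx], sample_weight=w[train_idx])
    # The test fold is scored without weights.
    scores.append(model.score(X[test_idx], y[test_idx]))
```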

markdregan commented 1 year ago

Thank you! I will test this week. 🙌

markdregan commented 1 year ago

TY. Confirmed working correctly. Thanks for the fast PR!