feature-engine / feature_engine

Feature engineering package with sklearn like functionality
https://feature-engine.trainindata.com/
BSD 3-Clause "New" or "Revised" License
1.8k stars 304 forks source link

add test to ensure estimators raise errors when X has duplicated column names #687

Open david-cortes opened 11 months ago

david-cortes commented 11 months ago

Per comment in PR: https://github.com/feature-engine/feature_engine/pull/686#pullrequestreview-1544746361

After the PR in the link above gets merged, there will be a general check for duplicated columns for which a test was added in test_dataframe_checks.py. This new error raised on duplicated column names should be tested also in the invidual transformers in estimator_checks/* after the files gets refactored.

solegalli commented 11 months ago

Thank you @david-cortes

Morgan-Sell commented 9 months ago

hi @solegalli and @david-cortes,

Slowly dipping my toes back into feature-engine.

Is the idea to create a test/check function on dataframe_checks.py and then add the check functions to the parent transformers in the fit() method?

solegalli commented 9 months ago

No, I think the idea was to add a generic test to estimator_checks to test all transformers to ensure that they inherit this functionality properly. I am not sure if it is an overkill or not.

Morgan-Sell commented 9 months ago

Ok, so as the code base stands, feature-engine transformers raise errors when there are duplicative feature names, correct? This issue is to create a test that the error is being raised when expected.

solegalli commented 9 months ago

Correct!