feature-engine / feature_engine

Feature engineering package with sklearn like functionality
https://feature-engine.trainindata.com/
BSD 3-Clause "New" or "Revised" License
1.88k stars, 310 forks

DropCorrelatedFeatures and SmartCorrelatedSelection consistency changes #633

Closed: glevv closed this 1 year ago

glevv commented 1 year ago

Fixes #612

Will also close #619

codecov[bot] commented 1 year ago

Codecov Report

Merging #633 (9ad37aa) into main (ab91403) will increase coverage by 0.12%. The diff coverage is 100.00%.

Note: Current head 9ad37aa differs from pull request most recent head 184525f. Consider uploading reports for the commit 184525f to get more accurate results.

@@            Coverage Diff             @@
##             main     #633      +/-   ##
==========================================
+ Coverage   97.90%   98.03%   +0.12%     
==========================================
  Files          98       98              
  Lines        3588     3620      +32     
  Branches      695      707      +12     
==========================================
+ Hits         3513     3549      +36     
+ Misses         28       26       -2     
+ Partials       47       45       -2     
Impacted Files                                           Coverage Δ
...ature_engine/selection/drop_correlated_features.py    95.45% <100.00%> (+5.45%)
...re_engine/selection/smart_correlation_selection.py    100.00% <100.00%> (+2.35%)


glevv commented 1 year ago

I decided to keep only {None, 'nan', 'unique', 'alphabetic'}. Ordering by std could be misleading when the features are on different scales (not standardized). Ordering by cv (coefficient of variation) would fix that, but it introduces another problem when the mean is equal (or close) to 0.
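
The scale argument above can be illustrated with a minimal sketch. It is not taken from the PR: the helper `cv` and the sample data are hypothetical, and only the Python standard library is used. It shows that std changes when the same feature is expressed in different units, that cv (taken here as std divided by the absolute mean) is unaffected by the unit change, and that cv becomes unusable once the mean is at or near 0.

```python
import math
import statistics


def cv(values):
    """Coefficient of variation, assumed here to be std / |mean|.

    Returns infinity when the mean is exactly 0, to make the
    degenerate case explicit instead of raising ZeroDivisionError.
    """
    mean = statistics.fmean(values)
    if mean == 0:
        return math.inf
    return statistics.pstdev(values) / abs(mean)


# The same hypothetical feature in metres and in centimetres.
height_m = [1.60, 1.70, 1.80, 1.90]
height_cm = [v * 100 for v in height_m]

# The same feature after centering, so its mean is (close to) 0.
mean_m = statistics.fmean(height_m)
height_centered = [v - mean_m for v in height_m]

# std depends on the unit: the centimetre version looks far more variable.
std_ratio = statistics.pstdev(height_cm) / statistics.pstdev(height_m)
print("std ratio (cm / m):", std_ratio)

# cv is scale-free, so both versions agree.
print("cv in m: ", cv(height_m))
print("cv in cm:", cv(height_cm))

# With a mean at or near 0 the cv is infinite or astronomically large.
print("cv centered:", cv(height_centered))
```

Under these assumptions, ranking features by std would favour whichever feature happens to be recorded in the smaller unit, while ranking by cv would be dominated by any centered or near-zero-mean feature.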

solegalli commented 1 year ago

We still need to extend this functionality to the smart correlation selector before we can merge.

solegalli commented 1 year ago

See #648