feature-engine / feature_engine

Feature engineering package with sklearn like functionality
https://feature-engine.trainindata.com/
BSD 3-Clause "New" or "Revised" License

OHE: User can select which categories to encode for selected variables #667

Open Morgan-Sell opened 1 year ago

Morgan-Sell commented 1 year ago

Closes #303.

Objective: Create functionality so a user can encode certain categories that may not be the most frequent. The functionality is explained in this thread.

This PR will add a new init param called custom_categories that accepts a dictionary with the selected features as keys and lists of the desired categories as values.

top_categories and user_categories cannot be used at the same time.
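A rough sketch of the behaviour described above, in plain Python so it runs without pandas or feature_engine. The helper names `check_category_params` and `encode_custom_categories`, the row-of-dicts input, and the `feature_category` column naming are assumptions made for illustration; they are not the transformer's actual code or API.

```python
from typing import Any, Dict, List, Optional


def check_category_params(
    top_categories: Optional[int],
    custom_categories: Optional[Dict[str, List[Any]]],
) -> None:
    """Hypothetical validation: the two options are mutually exclusive."""
    if top_categories is not None and custom_categories is not None:
        raise ValueError(
            "top_categories and custom_categories cannot be used at the same time."
        )
    if custom_categories is not None:
        if not isinstance(custom_categories, dict):
            raise ValueError("custom_categories must be a dictionary.")
        for feature, categories in custom_categories.items():
            if not isinstance(categories, list):
                raise ValueError(
                    f"The categories for '{feature}' must be given as a list."
                )


def encode_custom_categories(
    rows: List[Dict[str, Any]],
    custom_categories: Dict[str, List[Any]],
) -> List[Dict[str, Any]]:
    """One-hot encode only the requested categories of each selected feature.

    Features that are not keys of custom_categories are passed through
    unchanged; each selected feature is replaced by one 0/1 column per
    requested category.
    """
    encoded = []
    for row in rows:
        # Keep every feature that was not selected for encoding.
        new_row = {k: v for k, v in row.items() if k not in custom_categories}
        for feature, categories in custom_categories.items():
            for category in categories:
                new_row[f"{feature}_{category}"] = int(row[feature] == category)
        encoded.append(new_row)
    return encoded


rows = [
    {"colour": "red", "size": 1},
    {"colour": "blue", "size": 2},
    {"colour": "green", "size": 3},
]
custom = {"colour": ["blue", "green"]}

check_category_params(top_categories=None, custom_categories=custom)
encoded_rows = encode_custom_categories(rows, custom)
```

Here "red" is the most frequent category no more or less than the others, yet it gets no column because the user did not ask for it: rows with an unrequested category end up with zeros in every new column.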

Morgan-Sell commented 1 year ago

hi @solegalli,

I think we're getting close on this one.

Are there any other unit tests that we should implement? According to the coverage report, OHE has 100% coverage.

I'll dig into the type error.

Morgan-Sell commented 1 year ago

Disregard my comment about code coverage. It seems that the transformer is very poorly "covered". I'll work on this.

Morgan-Sell commented 1 year ago

Hi @solegalli,

When you're back from vacay, lmk what you think.

Also, I don't know why I said that OHE has a low code coverage score. I just ran the coverage report - the transformer received 100% coverage!

solegalli commented 1 year ago

Hey @Morgan-Sell

I disappeared for a while, so I am a bit lost. I guess you were waiting for my input here, no? You let me know when I need to take another look? I am away from next Thursday, for a month, ehem, (red face)