scikit-learn-contrib / category_encoders

A library of sklearn compatible categorical variable encoders
http://contrib.scikit-learn.org/category_encoders/
BSD 3-Clause "New" or "Revised" License
2.4k stars 393 forks

Target encoding hierarchical columnwise #373

Closed nercisla closed 1 year ago

nercisla commented 1 year ago

This pull request enhances hierarchies in Target Encoders.

Author: @nercisla
Current status: Work in Progress

Proposed Changes

Allows a user to submit a hierarchy as columns of a dataframe (i.e. columnwise), not just as a mapping dictionary. Columns must be named HIER_colA_1, HIER_colA_2, HIER_colA_3, HIER_colB_1, HIER_colB_2, HIER_colC_1, etc., where the trailing digit is the level of the hierarchy (top = 1).
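To make the naming convention concrete, here is a minimal sketch of how such column names could be grouped by base column and ordered by level. The helper `group_hierarchy_columns` and the regex are hypothetical, written only for illustration; they are not part of category_encoders and not necessarily how this PR parses the columns. Treating multi-digit levels as valid is also an assumption.

```python
import re
from collections import defaultdict

# Hypothetical helper, not part of category_encoders: it only illustrates
# the HIER_<column>_<level> naming convention described above.
HIER_PATTERN = re.compile(r"^HIER_(?P<col>.+)_(?P<level>\d+)$")


def group_hierarchy_columns(column_names):
    """Map each base column to its hierarchy columns, top level (1) first."""
    levels = defaultdict(list)
    for name in column_names:
        match = HIER_PATTERN.match(name)
        if match is None:
            continue  # not a hierarchy column, e.g. the feature or the target
        levels[match["col"]].append((int(match["level"]), name))
    # Sort by the numeric level so that level 1 comes first.
    return {col: [name for _, name in sorted(pairs)] for col, pairs in levels.items()}


columns = [
    "HIER_colA_2", "HIER_colA_1", "HIER_colA_3",
    "HIER_colB_1", "HIER_colB_2",
    "colA", "target",
]
grouped = group_hierarchy_columns(columns)
print(grouped)
```

Columns that do not match the pattern (here `colA` and `target`) are ignored, so the hierarchy columns can sit alongside the ordinary features in the same dataframe.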

PaulWestenthanner commented 1 year ago

LGTM :)

PaulWestenthanner commented 1 year ago

I'll also build a release now so the changes will be ready to be pip installed