scikit-learn-contrib / category_encoders

A library of sklearn compatible categorical variable encoders
http://contrib.scikit-learn.org/category_encoders/
BSD 3-Clause "New" or "Revised" License

Target encoding hierarchical columnwise #373

Closed nercisla closed 2 years ago

nercisla commented 2 years ago

This pull request enhances hierarchies in Target Encoders.

Author: @nercisla
Current status: Work in Progress

Proposed Changes

Allows a user to supply a hierarchy as a dataframe (i.e. columnwise), not just as a mapping dictionary. Hierarchy columns must be named HIER_colA_1, HIER_colA_2, HIER_colA_3, HIER_colB_1, HIER_colB_2, HIER_colC_1, etc., where the trailing digit is the level of the hierarchy (1 = top).
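To illustrate the naming convention, here is a minimal sketch in plain pandas. It is not the code from this pull request: the helper `group_hierarchy_columns`, the regex, and the sample data are all hypothetical, and the sketch assumes the base column name is everything between the `HIER_` prefix and the final `_<level>` suffix.

```python
import re
from collections import defaultdict

import pandas as pd

# Hypothetical pattern for the convention described above:
# HIER_<column>_<level>, where level 1 is the top of the hierarchy.
HIER_PATTERN = re.compile(r"^HIER_(?P<col>.+)_(?P<level>\d+)$")


def group_hierarchy_columns(hierarchy: pd.DataFrame) -> dict:
    """Map each base column to its hierarchy columns, ordered from the top level down."""
    levels = defaultdict(list)
    for name in hierarchy.columns:
        match = HIER_PATTERN.match(name)
        if match is None:
            raise ValueError(f"Unexpected hierarchy column name: {name!r}")
        levels[match["col"]].append((int(match["level"]), name))
    # Sort by the numeric level so that level 1 (the top) comes first.
    return {col: [name for _, name in sorted(pairs)] for col, pairs in levels.items()}


# Illustrative data only: column order in the dataframe does not matter.
hierarchy = pd.DataFrame(
    {
        "HIER_colA_2": ["Dog", "Dog", "Cat"],
        "HIER_colA_1": ["Mammal", "Mammal", "Mammal"],
        "HIER_colB_1": ["Europe", "Asia", "Asia"],
    }
)
grouped = group_hierarchy_columns(hierarchy)
print(grouped)
```

Sorting on the parsed integer rather than on the column name keeps the ordering correct if a hierarchy ever has ten or more levels.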

PaulWestenthanner commented 2 years ago

LGTM :)

PaulWestenthanner commented 2 years ago

I'll also build a release now so the changes will be available to pip install.