Closed Khaled-Issa closed 1 year ago
Hi @Khaled-Issa
thanks for your contribution. This encoder seems to me to do pretty much the same as the target encoder, except that the target encoder has regularization to avoid over-fitting when categories are small. So I'm wondering whether response coding adds value. Do you have an academic reference or a good argument for why it can be better (in some situations) than target encoding?
Hi @PaulWestenthanner,
I got the idea of response coding from this medium article: https://medium.com/@thewingedwolf.winterfell/response-coding-for-categorical-data-7bb8916c6dc1
I agree with you that it acts like target encoding; the only difference is that it calculates the probabilities not just for label = 1 but for label = 0 as well, and adds two columns to the dataframe instead of one: one for the probability that the label is 1, the other for the probability that it is 0.
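For context, a minimal sketch of what response coding computes for a binary target (the helper name and column naming are hypothetical, not the PR's actual code):

```python
import pandas as pd

def response_code(train, col, target):
    # Hypothetical sketch of response coding for a binary target:
    # one output column per class, holding P(class | category)
    # estimated from the training data.
    p1 = train.groupby(col)[target].mean()  # P(y == 1 | category)
    return pd.DataFrame({
        f"{col}_p1": train[col].map(p1),
        f"{col}_p0": train[col].map(1.0 - p1),
    })

df = pd.DataFrame({
    "city": ["a", "a", "b", "b", "b"],
    "y":    [1,   0,   1,   1,   0],
})
print(response_code(df, "city", "y"))
```

Note that for a binary target the two columns always sum to 1, so the label = 0 column carries no extra information, which is relevant to the redundancy question below.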
Do you think it would be valuable to add a parameter to the target encoder that, when set to true, also adds the label = 0 probabilities to the encoded dataframe, or would that be redundant?
Fixes #
Proposed Changes
I added a new encoding method, response coding, to the set of categorical encoders that are already there.