InseadDataAnalytics / INSEADAnalytics

Lumping "Other" and "Unknown" #121

Open kajajasik opened 6 years ago

kajajasik commented 6 years ago

What is the threshold for lumping data from rare categories? E.g., if two of my rare categories account for more than X% of the data set, they should not be lumped together...
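To make the question concrete, here is a minimal Python sketch of one possible lumping rule. It is an illustration only, not the repository's actual code: the function names `lump_rare` and `lumped_share`, the `min_share` parameter, and the 5% default are all hypothetical. It assumes a category is "rare" when its individual share of the rows falls below `min_share`, and it reports the combined share of everything that would end up in "Other", which is the quantity the X% above refers to.

```python
from collections import Counter


def lump_rare(values, min_share=0.05, other_label="Other"):
    """Replace every category whose individual share is below min_share.

    min_share and other_label are hypothetical parameters for this sketch.
    """
    counts = Counter(values)
    total = len(values)
    # A category is treated as rare if its own share is under the threshold.
    rare = {cat for cat, n in counts.items() if n / total < min_share}
    return [other_label if v in rare else v for v in values]


def lumped_share(values, min_share=0.05):
    """Return the combined share of all categories that would be lumped."""
    counts = Counter(values)
    total = len(values)
    return sum(n for n in counts.values() if n / total < min_share) / total


# Toy data: C and D are each 4% of the rows, so both fall under the 5% cutoff.
data = ["A"] * 60 + ["B"] * 32 + ["C"] * 4 + ["D"] * 4
lumped = lump_rare(data)
combined = lumped_share(data)
```

With this toy data, two categories that are each rare on their own add up to a larger "Other" group, so a second check on `lumped_share` against X% could decide whether to keep them separate instead.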