Frequency-based Variable Dropping

Hi,

Typically, we would want to exclude the dummy for the most frequently observed category, so that it serves as the reference category.

E.g. if the categories are 'small', 'medium' and 'large', and 'medium' is the shirt size that 80 percent of the population wears, then one typically drops the 'medium' dummy so that the regression's baseline reflects the typical situation and not an outlier.

It would be handy to have an option that drops the most frequent category rather than just the first one. This should be fairly simple to achieve by hand, but it would be neat to have it integrated directly in the package.
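To illustrate what I mean, here is a minimal sketch of the manual workaround in base R. It does not use fastDummies at all; the `sizes` data and all variable names are made up for illustration, and the columns come out with `model.matrix()` naming rather than the package's naming.

```r
# Hypothetical example data: shirt sizes where 'medium' dominates.
sizes <- c(rep("medium", 8), "small", "large")

# Find the most frequent category.
# which.max() returns the first maximum, so ties go to the
# alphabetically first level.
most_frequent <- names(which.max(table(sizes)))

# Make that category the reference level of the factor.
sizes_f <- relevel(factor(sizes), ref = most_frequent)

# Under the default treatment contrasts, model.matrix() omits the
# reference level; [, -1] then removes the intercept column.
dummies <- model.matrix(~ sizes_f)[, -1, drop = FALSE]

# Remaining dummy columns: one for 'large', one for 'small'.
colnames(dummies)
```

With 'medium' as the reference level, a regression on these dummies measures 'small' and 'large' relative to the typical case. An option in the package that does this in one step would save the releveling.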

Thanks for considering!

jacobkap / fastDummies