I noticed that there are a lot of constant-zero features in the data, and I was wondering why.
Turns out that if a categorical feature is imbalanced enough, all the minority values count as outliers, so the outlier removal makes the feature constant 0. That's probably not ideal.
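Here's a rough sketch of what I think is going on. I don't know exactly which outlier rule the pipeline uses, so this assumes a plain z-score cutoff of 3 on a 0/1 encoded column; `zscore_outlier_mask` and the toy data are made up for illustration. Under that assumption, the 1s in a binary column get flagged as soon as their share drops below 10%, because their z-score is sqrt((1 - p) / p).

```python
import numpy as np

def zscore_outlier_mask(x, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean.

    Hypothetical helper; the real pipeline may use a different rule.
    """
    std = x.std()
    if std == 0:
        # A constant column has no outliers under this rule.
        return np.zeros(x.shape, dtype=bool)
    return np.abs(x - x.mean()) / std > threshold

# Toy one-hot column: 990 zeros and 10 ones (1% minority).
feature = np.zeros(1000)
feature[:10] = 1.0

mask = zscore_outlier_mask(feature)
kept = feature[~mask]  # drop the flagged values

print(mask.sum())       # 10: every minority value is flagged
print(np.unique(kept))  # [0.]: the feature is now constant zero
```

If that is what's happening, one option might be to skip the outlier step for categorical / one-hot columns and only run it on the continuous ones.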