Closed by jschulberg 2 years ago
From this post, it looks like it's actually a popular technique to one-hot encode (OHE) categorical features and then apply PCA. It's not as meaningful as doing PCA on continuous variables, but because OHE tends to substantially increase the dimensionality of our data, PCA helps pare down the number of variables we end up with.
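For reference, here's a rough sketch of the OHE-then-PCA approach using pandas and scikit-learn. The toy data frame and its column names are made up for illustration (not our actual dataset), and the 90% variance threshold is an arbitrary choice.

```python
import pandas as pd
from sklearn.decomposition import PCA

# Toy data: column names and values are hypothetical, for illustration only.
df = pd.DataFrame({
    "color": ["red", "blue", "green", "blue", "red", "green"],
    "size": ["S", "M", "L", "M", "S", "L"],
    "shape": ["round", "square", "round", "round", "square", "square"],
})

# One-hot encode: 3 categorical columns become 8 indicator columns.
encoded = pd.get_dummies(df, dtype=float)

# Keep just enough principal components to explain 90% of the variance
# (the 0.90 threshold is an assumption, tune as needed).
pca = PCA(n_components=0.90, svd_solver="full")
reduced = pca.fit_transform(encoded)

print(encoded.shape, "->", reduced.shape)
print(pca.explained_variance_ratio_.round(3))
```

Since indicator columns from the same feature are linearly dependent (they sum to 1), PCA can usually drop at least a few dimensions without losing anything.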
PCA doesn't seem to be working too well, though. It may be worthwhile to use the prince package to attempt Multiple Correspondence Analysis (MCA) instead, which is designed for multiple categorical features. Or we could try Multiple Factor Analysis (MFA), which works on a combination of continuous and categorical features.
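To get a feel for what MCA actually computes, here's a minimal from-scratch sketch in NumPy: correspondence analysis applied to the one-hot indicator matrix. This is not prince's API (I'm not reproducing its calls here), the helper name `mca_row_coordinates` is hypothetical, and the toy data is made up. In practice we'd presumably let prince handle this.

```python
import numpy as np
import pandas as pd

def mca_row_coordinates(df, n_components=2):
    """Minimal MCA sketch: correspondence analysis of the indicator matrix."""
    Z = pd.get_dummies(df, dtype=float).to_numpy()
    P = Z / Z.sum()            # correspondence matrix (entries sum to 1)
    r = P.sum(axis=1)          # row masses
    c = P.sum(axis=0)          # column masses
    # Standardized residuals from the independence model r c^T.
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    # Row principal coordinates for the leading components.
    coords = (U[:, :n_components] * s[:n_components]) / np.sqrt(r)[:, None]
    inertia = s ** 2
    return coords, inertia[:n_components] / inertia.sum()

# Toy data: column names and values are hypothetical, for illustration only.
df = pd.DataFrame({
    "color": ["red", "blue", "green", "blue", "red", "green"],
    "size": ["S", "M", "L", "M", "S", "L"],
    "shape": ["round", "square", "round", "round", "square", "square"],
})

coords, explained = mca_row_coordinates(df)
print(coords.round(3))
print(explained.round(3))
```

The main difference from OHE + PCA is the weighting: MCA scales each indicator column by its category frequency (a chi-square distance), so rare categories aren't drowned out by common ones.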
From StackOverflow: