Closed by bora-pajo 7 years ago
It really depends. Two features may have a correlation of .90, but the remaining .10 might have a huge impact on the prediction. On the other hand, it might be garbage. The solution here isn't always throwing a dimensionality reduction algorithm (such as PCA) at the data; sometimes it is just flat out removing a feature. We try different approaches, see what works best for our model, and proceed accordingly.
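To make the "just remove a feature" option concrete, here is a minimal sketch of how one might flag highly correlated feature pairs before deciding what to try. It uses only NumPy; the helper name `highly_correlated_pairs`, the synthetic data, and the .80 cutoff (taken from the question) are assumptions for illustration, not a recommended rule.

```python
import numpy as np

def highly_correlated_pairs(X, names, threshold=0.8):
    """Return (name_a, name_b, r) for every feature pair with |r| > threshold.

    Hypothetical helper for illustration; X has one column per feature.
    """
    corr = np.corrcoef(X, rowvar=False)  # Pearson correlation between columns
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

# Synthetic data, made up for this example: "b" is a noisy copy of "a".
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = a + 0.3 * rng.normal(size=500)
c = rng.normal(size=500)
X = np.column_stack([a, b, c])
names = ["a", "b", "c"]

pairs = highly_correlated_pairs(X, names, threshold=0.8)
for name_a, name_b, r in pairs:
    print(f"{name_a} ~ {name_b}: r = {r:.2f}")

# One candidate to try: drop the second feature of each flagged pair.
to_drop = {name_b for _, name_b, _ in pairs}
keep = [i for i, n in enumerate(names) if n not in to_drop]
X_reduced = X[:, keep]
```

From there, you would fit the same model on `X`, on `X_reduced`, and on a PCA-transformed version, then compare validation scores. The flagged pairs are only candidates; whether dropping one helps is something the scores have to tell you.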
Thank you, @utkuozbulak!
I am wondering whether, as a rule of thumb, we should remove variables that are highly correlated with each other (say, over .80) from any classification analysis. Is that correct, or does PCA take care of it so that we do not even need to check for correlation? Thank you in advance for your answers.