InseadDataAnalytics / INSEADAnalytics

Dimensionality Reduction in data with few variables #151

Open sanjayc28 opened 6 years ago

sanjayc28 commented 6 years ago

For our project, we are working on a dataset with relatively few variables (<10). I would imagine that dimensionality reduction, in theory, may have some value, but it may also leave us with only 2-3 variables (components) to work with. Is there a tradeoff here? Meaning - should dimensionality reduction only be used if you have a large number of variables to begin with? Or could it still be useful in the case of 5 or 10 variables? Thanks!

samjameswaller commented 6 years ago

With 10 variables it could still be highly useful - especially if the explanatory power is predominantly in a few of the variables. E.g. if 75% of the variance in the values is explained by 5 variables, then it could be worth cutting them down to simplify the analysis and reduce the risk of overfitting.
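
One way to check this on your own data is to look at how much of the total variance each principal component captures. The snippet below is only a rough sketch in Python with NumPy, not something from the course material: the function names `explained_variance_ratio` and `components_needed` are made up for illustration, the 75% threshold is just the figure mentioned above, and the 10-variable dataset is synthetic (built from 2 hidden factors plus noise) so that the example runs on its own.

```python
import numpy as np


def explained_variance_ratio(X):
    """Share of total variance captured by each principal component."""
    X = np.asarray(X, dtype=float)
    # Standardize so variables measured on different scales count equally.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Squared singular values of the standardized data are proportional
    # to the variance along each principal component.
    s = np.linalg.svd(Z, compute_uv=False)
    var = s ** 2 / (Z.shape[0] - 1)
    return var / var.sum()


def components_needed(ratios, threshold=0.75):
    """Smallest number of leading components whose cumulative share
    of variance reaches the threshold."""
    cumulative = np.cumsum(ratios)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    # Guard against rounding leaving the final cumulative sum just below 1.
    return min(k, len(ratios))


# Synthetic stand-in for a small dataset: 200 rows, 10 variables driven
# by 2 hidden factors plus noise (an assumption for illustration only).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 10))
X = latent @ loadings + 0.3 * rng.normal(size=(200, 10))

ratios = explained_variance_ratio(X)
k = components_needed(ratios, threshold=0.75)
print("variance share per component:", np.round(ratios, 3))
print("components needed for 75% of variance:", k)
```

If the count printed for your data is close to the number of original variables, the reduction buys you little. If it is much smaller, working with the components is probably worth it even when you only started with 5 or 10 variables.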