Closed by yxiang001 2 years ago
I set n_components to 600 and ran the model again. It was fast, and the PCA model with n_components = 600 explains about 98.2% of the variance in the training dataset. After that, I picked the number of components at which the cumulative explained variance exceeds 95%. That gives n_components = 195 for my training set.
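For anyone landing here with the same problem, the workflow above can be sketched roughly as follows. This is only an illustration, not the exact code used in this thread: it assumes scikit-learn's `PCA`, and the helper names `components_for_variance` and `pick_n_components`, the `svd_solver="randomized"` choice, and the `random_state` value are my own assumptions.

```python
from itertools import accumulate


def components_for_variance(ratios, threshold=0.95):
    """Return the smallest k whose first k ratios sum to at least `threshold`.

    `ratios` is expected to look like PCA's explained_variance_ratio_:
    one non-negative fraction per component, largest first.
    """
    for k, total in enumerate(accumulate(ratios), start=1):
        if total >= threshold:
            return k
    raise ValueError("threshold not reached; fit PCA with more components")


def pick_n_components(X, max_components=600, threshold=0.95):
    """Fit a capped PCA on X and pick the component count for `threshold`.

    Hypothetical helper: capping n_components avoids computing every
    component of a very wide dataset. max_components must not exceed
    min(n_samples, n_features).
    """
    # Imported here so the helper above works without scikit-learn installed.
    from sklearn.decomposition import PCA

    # Assumption: the randomized solver is used to keep the capped fit fast.
    pca = PCA(n_components=max_components, svd_solver="randomized", random_state=0)
    pca.fit(X)
    return components_for_variance(pca.explained_variance_ratio_, threshold)
```

Calling `pick_n_components(X_train, max_components=600, threshold=0.95)` on your own training matrix (here `X_train` is a placeholder name) would return the component count to use for the final PCA. If the capped fit explains less variance than the threshold, the helper raises an error instead of silently returning the cap, which is the signal to raise `max_components`.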
Our dataset has 2002003 features, and it takes a very long time to run the PCA model for feature reduction. I am trying to run PCA without setting the number of components, then plot the cumulative explained variance ratio to pick the best number of components. But it has already been running for 15 minutes. The model is still running and has not produced a result.