Closed by aarmey 2 years ago
We actually can't use `statsmodels.multivariate.pca` because it can't handle large datasets. I wrote a function that uses exactly the same EM scheme with `TruncatedSVD`, and it should be able to handle missing values now.
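For readers following the thread, here is a rough sketch of what an EM-style PCA with missing values can look like when built on scikit-learn's `TruncatedSVD`. It is not the tensorpack implementation: the function name `em_pca_impute`, its parameters, the column-mean initialization, and the stopping rule are all assumptions made for illustration. Note also that `TruncatedSVD` does not center the data, so this sketch works on the raw matrix.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD


def em_pca_impute(X, n_components=2, max_iter=200, tol=1e-8):
    """Hypothetical EM-style low-rank imputation (illustrative only).

    Missing entries are marked with NaN. Returns the filled matrix and
    the fitted TruncatedSVD object from the last iteration.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Start by filling each missing entry with its column mean.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)

    prev_err = np.inf
    for _ in range(max_iter):
        # M-step: fit a rank-k factorization to the currently filled data.
        svd = TruncatedSVD(n_components=n_components, random_state=0)
        scores = svd.fit_transform(filled)
        recon = scores @ svd.components_

        # E-step: replace only the missing entries with the reconstruction.
        filled[missing] = recon[missing]

        # Stop when the fit on the observed entries no longer changes.
        err = np.linalg.norm((recon - filled)[~missing])
        if abs(prev_err - err) < tol:
            break
        prev_err = err

    return filled, svd
```

Observed entries are never overwritten; only the NaN positions are updated on each pass, which is the core of the EM scheme described above.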
Excellent! Thanks.
`statsmodels.multivariate.pca` uses EM to properly solve for the PCA results in the presence of missing values.
https://github.com/meyer-lab/tensorpack/blob/fb467a1fc30040954872aa6fb2cd659d3122cd7e/tensorpack/decomposition.py#L28