Clustering: the number of PCs should be set using a better statistical criterion than
percent_variance. Specifically, we know what the eigenvalues of a given number of
samples of univariate Gaussian white noise should look like, so we should keep only
those PCs whose eigenvalues are significantly above that...
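
A rough sketch of one way this could be done, not a tested implementation. It swaps in a Monte Carlo comparison (Horn-style parallel analysis) as a stand-in for "significantly above white noise". It assumes the data is an (n_samples, n_dims) array and that each dimension is standardized to unit variance before PCA, so that it is comparable to unit-variance noise. The function name, its parameters, and the 95th-percentile cutoff are all hypothetical choices made for illustration.

```python
import numpy as np

def n_significant_pcs(data, n_noise_draws=100, percentile=95, seed=0):
    """Count leading PCs whose eigenvalues exceed those of Gaussian white noise.

    data: array of shape (n_samples, n_dims); every column is assumed to
    have nonzero variance (hypothetical requirement of this sketch).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_dims = data.shape

    # Standardize each dimension so the data is on the same scale as
    # unit-variance white noise (assumption: correlation-matrix PCA is OK).
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)

    # Eigenvalues of the data covariance, sorted in descending order.
    eigvals = np.linalg.eigvalsh(np.cov(z, rowvar=False))[::-1]

    # Eigenvalues of white noise with the same number of samples and dims.
    noise_eigvals = np.empty((n_noise_draws, n_dims))
    for i in range(n_noise_draws):
        noise = rng.standard_normal((n_samples, n_dims))
        noise_eigvals[i] = np.linalg.eigvalsh(np.cov(noise, rowvar=False))[::-1]

    # Per-rank threshold: the k-th data eigenvalue is compared against the
    # chosen percentile of the k-th noise eigenvalue.
    threshold = np.percentile(noise_eigvals, percentile, axis=0)
    above = eigvals > threshold

    # Keep leading PCs only, stopping at the first one that fails the test.
    return n_dims if above.all() else int(np.argmin(above))
```

The returned count would then replace whatever number of PCs percent_variance currently yields. Comparing rank by rank against simulated noise avoids hard-coding an analytic noise spectrum, at the cost of extra computation per call.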