Data matrices often need to be centered before PCA. For a gene expression matrix, the intensities of each individual gene should have zero mean and unit variance. I think the scaling should use the population standard deviation instead of the sample standard deviation:
https://github.com/fslaborg/FSharp.Stats/blob/3aa4c4ce5768e6e1e49d45efd6d2de5e1562e319/src/FSharp.Stats/ML/Unsupervised/PrincipalComponentAnalysis.fs#L41
Note: The result does NOT change; only the scaling is slightly different. For the sake of completeness I suggest updating it. What do you think, @ZimmerD?
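
To illustrate the difference, here is a minimal sketch in plain Python (not the FSharp.Stats code). The `zscore` helper and the intensity values are made up for the example. It assumes each gene is z-scored independently, and it shows that the two variants differ only by the constant factor sqrt(n / (n - 1)):

```python
import math
import statistics

def zscore(values, population=True):
    """Center values and divide by the population SD (divisor n)
    or the sample SD (divisor n - 1). Hypothetical helper."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values) if population else statistics.stdev(values)
    return [(v - mean) / sd for v in values]

# Made-up intensities for one gene across five samples.
intensities = [2.0, 4.0, 4.0, 5.0, 7.0]
n = len(intensities)

z_pop = zscore(intensities, population=True)
z_sample = zscore(intensities, population=False)

# Measured with the population variance (divisor n), only z_pop has variance 1;
# z_sample ends up with variance (n - 1) / n.
var_pop = statistics.pvariance(z_pop)
var_sample = statistics.pvariance(z_sample)

# Every value differs by the same constant factor between the two variants.
ratio = math.sqrt(n / (n - 1))
```

Because the factor is the same for every gene, switching the standard deviation rescales the scaled matrix uniformly, which is why I expect the PCA result to keep its structure and only change in scale.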