cavery12 closed this 4 years ago
The hPCA eigenvalues are derived from diagonalizing the generalized coordinates, so the variance of the entire set of variables for each residue is now concentrated in the reduced set of modes. These modes are then 'stacked', which possibly inflates their magnitudes greatly.
After doing quite a bit of testing, I have discovered an interesting situation:
Given an all-atom subset used for both Cartesian and Hierarchical PCA analysis, it sometimes happens that the RANK of the Cartesian covariance matrix is FULL, but the RANK of the Hierarchical covariance matrix is NOT FULL. It is usually nearly full, but short by 1 to 5.
This is exactly why the two methods do not always yield identical results for a single residue, as they should. When the rank is less than full, the Hierarchical eigenvector matrix is not an identity matrix. It is very NEAR the identity, except for those last 1 to 5 modes.
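For anyone who wants to reproduce the rank comparison outside JEDi, here is a minimal sketch of a numerical rank check on a covariance matrix. It is not JEDi's code. The function names `covariance` and `numerical_rank`, the relative tolerance `rel_tol`, and the toy data are all assumptions made for illustration. The toy data has a third variable that is the sum of the first two, so its covariance matrix is short of full rank by 1.

```python
def covariance(samples):
    """Sample covariance matrix; rows are observations, columns are variables."""
    n = len(samples)
    dim = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(dim)]
    centered = [[row[j] - means[j] for j in range(dim)] for row in samples]
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def numerical_rank(matrix, rel_tol=1e-10):
    """Count pivots above rel_tol * (largest absolute entry).

    Uses Gaussian elimination with partial pivoting. rel_tol is a
    hypothetical threshold chosen for this sketch.
    """
    a = [[float(x) for x in row] for row in matrix]
    rows, cols = len(a), len(a[0])
    scale = max(abs(x) for row in a for x in row)
    if scale == 0.0:
        return 0
    threshold = rel_tol * scale
    rank = 0
    for col in range(cols):
        if rank == rows:
            break
        # Pick the largest remaining entry in this column as the pivot.
        pivot = max(range(rank, rows), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) <= threshold:
            continue  # column is numerically dependent on earlier ones
        a[rank], a[pivot] = a[pivot], a[rank]
        for r in range(rank + 1, rows):
            factor = a[r][col] / a[rank][col]
            for c in range(col, cols):
                a[r][c] -= factor * a[rank][c]
        rank += 1
    return rank


# Toy data (made up): the third variable equals the sum of the first two.
samples = [[1, 2, 3], [2, 1, 3], [4, 0, 4], [0, 3, 3], [3, 3, 6]]
cov = covariance(samples)
print(numerical_rank(cov), "of", len(cov))
```

Running the same check on both the Cartesian and the Hierarchical covariance matrices, with the same tolerance, would show how many modes are missing in each case.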
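To pin down which modes depart from the identity, a check along the following lines could help. This is only a sketch under stated assumptions: eigenvectors are stored as columns, `per_mode_deviation` and `tol` are hypothetical names, and the 4x4 matrix is made up so that only the last two modes are mixed. Absolute values are compared because the sign of an eigenvector is arbitrary.

```python
import math


def per_mode_deviation(eigvecs):
    """Largest absolute departure from the identity, per column (mode).

    Assumes eigenvectors are stored as columns. Absolute values are
    compared because an eigenvector's overall sign is arbitrary.
    """
    n = len(eigvecs)
    return [
        max(abs(abs(eigvecs[i][j]) - (1.0 if i == j else 0.0)) for i in range(n))
        for j in range(n)
    ]


# Made-up example: identity except for a small rotation mixing the last two modes.
theta = 0.1
c, s = math.cos(theta), math.sin(theta)
eigvecs = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, c, -s],
    [0.0, 0.0, s, c],
]

tol = 1e-6  # hypothetical threshold for "near identity"
deviations = per_mode_deviation(eigvecs)
flagged = [j for j, d in enumerate(deviations) if d > tol]
print(flagged)
```

If the flagged modes line up with the modes missing from the rank count, that supports the link between the rank deficiency and the non-identity eigenvectors.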
So, I have now identified the REASON for the slight inconsistencies, but not the CAUSE. I would be grateful for any ideas on what is causing this loss of rank in the hierarchical covariance matrix.
All discrepancies now accounted for by known code.
The top eigenvalues for Hierarchical PCA are very large compared with those of the other PCA types in JEDi. The code is being checked for this issue.