Closed HannesHolste closed 6 years ago
@antgonza: that was very insightful, thank you. I configured eigsh to use shift-invert mode and re-ran the failed PCoA benchmarks successfully, with no convergence error.
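For anyone following along, here is a minimal sketch of what shift-invert mode looks like in `scipy.sparse.linalg.eigsh`. Passing `sigma` switches the mode on, and `which='LM'` then returns the eigenvalues closest to `sigma`. The toy diagonal matrix, `sigma=0` and the helper name `eigs_near` are assumptions for illustration only, not the settings used in the benchmarks.

```python
import numpy as np
from scipy.sparse.linalg import eigsh


def eigs_near(matrix, sigma, k=3):
    """Return the k eigenpairs of a symmetric matrix closest to sigma.

    Hypothetical helper: passing sigma to eigsh enables shift-invert
    mode, where which='LM' selects the eigenvalues nearest to sigma.
    """
    return eigsh(matrix, k=k, sigma=sigma, which='LM')


# Toy symmetric matrix with known eigenvalues 1, 2, ..., 10.
toy = np.diag(np.arange(1.0, 11.0))
near_vals, near_vecs = eigs_near(toy, sigma=0.0)
```

With `sigma=0.0` the call should pick out the three eigenvalues closest to zero. Which `sigma` suits a given PCoA matrix is a separate question that this sketch does not answer.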
But just in case, I also added code to this PR to deal with convergence errors: any eigenvalues and eigenvectors that fail to converge are imputed as NaN. A unit test is included.
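Roughly, the NaN imputation could look like the sketch below. This is not the exact code in the PR; the function names `pad_with_nan` and `eigsh_nan_on_failure` are made up for illustration. It relies on `ArpackNoConvergence` exposing the partially converged results through its `eigenvalues` and `eigenvectors` attributes.

```python
import numpy as np
from scipy.sparse.linalg import eigsh, ArpackNoConvergence


def pad_with_nan(eigvals, eigvecs, k, n):
    """Pad partial results to k eigenvalues and an (n, k) vector array."""
    eigvals = np.asarray(eigvals, dtype=float)
    found = eigvals.shape[0]
    out_vals = np.full(k, np.nan)
    out_vecs = np.full((n, k), np.nan)
    if found:
        # Copy whatever converged; the remaining slots stay NaN.
        out_vals[:found] = eigvals
        out_vecs[:, :found] = np.asarray(eigvecs, dtype=float)
    return out_vals, out_vecs


def eigsh_nan_on_failure(matrix, k, **eigsh_kwargs):
    """Run eigsh; on non-convergence keep what converged, NaN the rest."""
    n = matrix.shape[0]
    try:
        eigvals, eigvecs = eigsh(matrix, k=k, **eigsh_kwargs)
    except ArpackNoConvergence as err:
        # ARPACK attaches the eigenpairs that did converge to the exception.
        eigvals, eigvecs = err.eigenvalues, err.eigenvectors
    return pad_with_nan(eigvals, eigvecs, k, n)


# Usage on a small symmetric positive semi-definite matrix.
rng = np.random.default_rng(0)
a = rng.standard_normal((20, 20))
sym = a @ a.T
vals, vecs = eigsh_nan_on_failure(sym, k=3)
```

The output shape is always `(k,)` and `(n, k)`, so downstream benchmark code does not need a special case for partial results, but it does need to tolerate NaN.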
Please review and merge if OK.
Problem: running pcoa with eigsh (reducing to 3 dimensions) on a subsample of a randomly generated distance matrix sometimes throws an exception because not all eigenvectors converge. So far this has occurred on only one dataset (the subsampled, randomly generated one), but the new benchmarks are still running on the cluster, so there may be more. The problem matrix was generated with the skbio randdm function at an original dimension of 4096 and subsampled to 3072.
Since eigsh is unlikely to be the pcoa method we finally choose, I decided for now to simply catch the exception and move on with the benchmarks. The results may not be 'correct', though, i.e. eigenvectors may be missing.
What are your thoughts @wasade @antgonza ?
Some related discussions: