ciortanmadalina / high_noise_clustering

Techniques to cluster very noisy data (dropouts or random noise)

Suggest np.linalg.eig for full symmetric matrices #1

Open bencardoen opened 5 years ago

bencardoen commented 5 years ago

Hi,

Firstly, your article and repo do an excellent job of explaining spectral clustering with the gap heuristic. I'd like to make a suggestion w.r.t. your eigenvalue computation: when the scipy sparse eigsh code is called with a dense (non-sparse) symmetric matrix and k=N, it emits a warning and falls back to the dense eigh call. The confusing part is that across 100 function calls the warning is emitted only once, so it is easily missed. May I suggest substituting np.linalg.eigh for full matrices? Note that setting K < N affects the heuristic, whereas computing the full set of eigenvalues does not. I'm referring to this notebook: https://github.com/ciortanmadalina/high_noise_clustering/raw/master/spectral_clustering.ipynb
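To illustrate the substitution, here is a minimal sketch, assuming the Laplacian is a dense symmetric NumPy array. The helper names `laplacian_spectrum` and `eigengap_k` and the toy affinity matrix are hypothetical and not taken from the notebook; the only library call doing the work is np.linalg.eigh, which returns all N eigenvalues in ascending order.

```python
import numpy as np


def laplacian_spectrum(L):
    """Full eigendecomposition of a dense symmetric matrix.

    np.linalg.eigh returns every eigenvalue in ascending order,
    so there is no k parameter to tune and no fallback warning.
    """
    return np.linalg.eigh(L)


def eigengap_k(eigenvalues):
    """Hypothetical helper: pick the cluster count at the largest gap
    between consecutive (ascending) eigenvalues."""
    gaps = np.diff(eigenvalues)
    return int(np.argmax(gaps)) + 1


# Toy affinity matrix (an assumption for illustration):
# two disconnected groups of three fully connected points.
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
np.fill_diagonal(A, 0.0)

# Unnormalized graph Laplacian L = D - A.
D = np.diag(A.sum(axis=1))
L = D - A

eigenvalues, eigenvectors = laplacian_spectrum(L)
k = eigengap_k(eigenvalues)  # 2 for this toy graph
```

Because the whole spectrum is available, the gap search is not limited to the first K eigenvalues.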

Thanks!

ciortanmadalina commented 5 years ago

Hello, thank you for your interest and your suggestion. I will look into it and integrate it as soon as I get some time! Have a nice day!