KlausVigo / kknn

Weighted k-Nearest Neighbors
http://klausvigo.github.io/kknn/

ARPACK error in specClust #7

Open ekernf01 opened 7 years ago

ekernf01 commented 7 years ago

For datasets with outliers, kknn::specClust runs into an igraph/ARPACK eigenvalue convergence error.

https://github.com/igraph/igraph/issues/512

On that thread, ntamas recommends removing isolated vertices from the graph, so one workaround might be to remove outliers before clustering. However, that has not worked well for me so far.

Here's an MWE:

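set.seed(1)
# Two Gaussian clusters of 250 points each, centered at (0, 0) and (40, 40)
outlier_2clust = matrix(rnorm(1000), ncol = 2)
outlier_2clust[1:250, ] = outlier_2clust[1:250, ] + 40
# One outlier far from both clusters
outlier_2clust = rbind(outlier_2clust, c(40, 0))
plot(outlier_2clust)
library(kknn)
# This call triggers the ARPACK convergence error
cluster_mod = kknn::specClust(outlier_2clust, centers = 2)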
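
For reference, here is a rough sketch of what "removing outliers prior to clustering" could look like on this data, using base R only. It flags points whose distance to their k-th nearest neighbour is far above the typical value. The names `k`, `cutoff_mult`, `knn_dist`, `is_outlier` and `filtered` are made up for illustration, and the neighbourhood size and cutoff rule are assumptions, not anything taken from kknn or igraph. The cutoff is deliberately crude and may also drop some points on the fringe of each cluster.

```r
# Rebuild the MWE data so this sketch runs on its own
set.seed(1)
outlier_2clust <- matrix(rnorm(1000), ncol = 2)
outlier_2clust[1:250, ] <- outlier_2clust[1:250, ] + 40
outlier_2clust <- rbind(outlier_2clust, c(40, 0))

k <- 7            # hypothetical neighbourhood size (an assumption)
cutoff_mult <- 5  # hypothetical multiplier for the cutoff (an assumption)

# Full pairwise Euclidean distance matrix (fine for a few hundred points)
d <- as.matrix(dist(outlier_2clust))

# Distance from each point to its k-th nearest neighbour.
# After sorting a row, element 1 is the self-distance 0, so the
# k-th neighbour sits at position k + 1.
knn_dist <- apply(d, 1, function(row) sort(row)[k + 1])

# Flag points whose k-NN distance is far above the typical value
is_outlier <- knn_dist > median(knn_dist) + cutoff_mult * mad(knn_dist)

# Keep only the points that were not flagged
filtered <- outlier_2clust[!is_outlier, , drop = FALSE]
```

The filtered matrix can then be passed to `kknn::specClust(filtered, centers = 2)`. Whether that avoids the ARPACK error is not something this sketch establishes.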

Here's my sessionInfo().

R version 3.3.1 (2016-06-21)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.6 (El Capitan)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] kknn_1.3.1

loaded via a namespace (and not attached):
[1] magrittr_1.5    Matrix_1.2-8    tools_3.3.1     igraph_1.0.1    grid_3.3.1      lattice_0.20-35