Closed Machiry closed 9 years ago
No. that line is only always hit if every point is its own medoid. I haven't tested these corner cases like k=1 or k=n. Also, if best=+infinity, it will be > 0 and thus the loop will stop. But I have little interest in PAM except for historical reasons.
As a matter of fact, running PAM with k=1 terminates for me after 1 iteration for me. But I will look into why it sometimes does seem to take forever.
The issue is much more subtle than that. I tried to literally implement the equations of Kaufman & Rousseeuw. As far as I can tell, there is an omission in the description that leads to these problems. Specifically, they only use "nonselected objects" for computing the cost. However, the element that will be deselected also needs to be reassigned to a new cluster. That is why my literal implementation ended up swapping medoids infinitely. On larger data sets, PAM will usually take an excessive amount of time in the swap phase. I suggest to always run it with a limit on the number of iterations.
It took me a long time to really figure out how to do this right, and at the same time avoid extra distance computations. It seems to be okay now.
New push to Kmedoids results in infinite loop when k=1.
When k=1, the continue statement at: https://github.com/elki-project/elki/blob/master/elki/src/main/java/de/lmu/ifi/dbs/elki/algorithm/clustering/kmeans/KMedoidsPAM.java#L200 will always be hit, thus best ending up Positive_infinity and never breaking.