kno10 / rust-kmedoids

k-Medoids clustering in Rust with the FasterPAM algorithm
GNU General Public License v3.0
20 stars 3 forks source link

Does k-medoid requires metric distance #8

Closed jianshu93 closed 4 months ago

jianshu93 commented 4 months ago

Hello kmedoid team,

Just a quick question on whether k-medoid also requires a metric distance since k-mean requires this to effectively select new centers and ensure convergency. My understanding is yes, especially the triangular inequality for select/update centers effectively.

Thanks,

Jianshu

kno10 commented 4 months ago

No, k-medoids does not use metric properties such as triangular inequality anywhere. You can implement it to maximize a sum of similarities, or you can allow negative distances, too. You can also extend this to multi-criteria optimization by weighting.

There exists a variant of the "alternating" algorithm (k-means style alternating optimization) that tries to exploit metric properties, we have not implemented this. In theory, this is subquadratic, but we found the alternating optimization to frequently yield worse results.

K-means optimizes sum-of-squares, which is not a metric. It does not minimize the sum of Euclidean distances. But there are several acceleration techniques that use the triangle inequality to bound Euclidean distances (and hence sum-of-squares) for acceleration.