kno10 / rust-kmedoids

k-Medoids clustering in Rust with the FasterPAM algorithm
GNU General Public License v3.0
20 stars 3 forks source link

Add CLARANS and FastCLARANS #6

Open kno10 opened 9 months ago

kno10 commented 9 months ago

As these methods require distance functions and a data matrix, it makes sense to first implement CLARA #5

CLARANS modifies PAM with a random sampling follows:

hence it approximates PAM.

We can easily (c.f., the FastPAM papers) integrate the FastPAM optimizations:

FastCLARANS:

It makes less sense to also integrate the "Eager" optimizations - eventually we can just run FasterPAM on shuffled input with a limit on the number of distance computations to achieve a similar approximation. Hence I do not believe CLARANS and FastCLARANS are of much practical relevance anymore.

The FastCLARANS reference implementation in Java is in ELKI: FastCLARANS but this contains data structures optimized for Java that we can do differently in Rust without the same overhead.