Write a k-medians clustering algorithm

amnh / PCG

𝙋𝙝𝙮𝙡𝙤𝙜𝙚𝙣𝙚𝙩𝙞𝙘 𝘾𝙤𝙢𝙥𝙤𝙣𝙚𝙣𝙩 𝙂𝙧𝙖𝙥𝙝 ⸺ Haskell program and libraries for general phylogenetic graph search

28 stars 1 forks source link

Currently our clustering pass uses the algorithms in AI.Clustering.Hierarchical. As it stands there is no easy way to make use of the k-means algorithm within this library.

We should write a k-medians algorithm which is linear time and has good convergence properties (it should also expose a good amount of parallelism). This will entail writing a k-medians function for each of our different character types i.e. functions that looks something like:

kMedian :: (Foldable t, Traversable t) => t CharacterType -> CharacterType

In the case of a discrete character this will function will look like an inclusion-exclusion formula and we should be able to write similar things for other character types.

This would then be added as a Clustering option (currently asking for median clustering gives an error explaining that this is not yet implemented).

amnh / PCG

Write a k-medians clustering algorithm #142