genbattle / dkm

A generic C++11 k-means clustering implementation
MIT License

Various commonly needed utility functions #2

Closed (eozd closed this 7 years ago)

eozd commented 7 years ago

This PR is about implementing commonly needed utility functions.

Fixes #1.

Proposed Changes

  1. A separate distance function under dkm::details
  2. The following utility functions under namespace dkm:
    • dist_to_center: Distance of each point to its cluster center
    • sum_dist: Sum of distances of each point to its cluster center
    • get_cluster: Get points that belong to the cluster with the given label
    • means_inertia: Calculate Euclidean distance inertia of a given clustering
    • get_best_means: Return best means from a means list based on inertia
    • n_kmeans: Run k-means algorithm a given number of times and return the best clustering.
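To make the list above more concrete, here is a minimal sketch of two of the listed helpers, the inertia calculation and the per-cluster point lookup. The names `distance_squared_sketch`, `inertia_sketch` and `cluster_points_sketch`, their signatures, and the use of `std::vector<std::array<T, N>>` with `uint32_t` labels are assumptions made for illustration; they are not the signatures in this PR. Inertia is assumed here to mean the sum of squared Euclidean distances from each point to its assigned center, which the description does not spell out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not dkm's API): squared Euclidean distance.
template <typename T, std::size_t N>
T distance_squared_sketch(const std::array<T, N>& a, const std::array<T, N>& b) {
    T sum = T(0);
    for (std::size_t i = 0; i < N; ++i) {
        const T diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Hypothetical helper: sum of squared distances from each point to the
// center of the cluster it is labelled with (assumed meaning of "inertia").
template <typename T, std::size_t N>
T inertia_sketch(const std::vector<std::array<T, N>>& points,
                 const std::vector<std::array<T, N>>& means,
                 const std::vector<std::uint32_t>& labels) {
    assert(points.size() == labels.size());
    T total = T(0);
    for (std::size_t i = 0; i < points.size(); ++i) {
        total += distance_squared_sketch(points[i], means[labels[i]]);
    }
    return total;
}

// Hypothetical helper: copy out the points that carry the given label.
template <typename T, std::size_t N>
std::vector<std::array<T, N>> cluster_points_sketch(
        const std::vector<std::array<T, N>>& points,
        const std::vector<std::uint32_t>& labels,
        std::uint32_t label) {
    assert(points.size() == labels.size());
    std::vector<std::array<T, N>> result;
    for (std::size_t i = 0; i < points.size(); ++i) {
        if (labels[i] == label) {
            result.push_back(points[i]);
        }
    }
    return result;
}
```

Under these assumptions, choosing the best of several runs amounts to computing the inertia of each run's means and labels and keeping the run with the smallest value.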

genbattle commented 7 years ago

I'll have another look tomorrow when I get some more time after work :-)

eozd commented 7 years ago

I have made the changes you requested. One thing I was unsure of is what you meant about the distance function. Do you mean rewriting distance_squared completely so that it returns the actual distance instead of its square? If not, can you clarify so that I can change it?
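For illustration, the two readings of the question can be sketched as follows: either the squared-distance function is kept and a thin wrapper takes its square root, or the squared version is replaced outright by the wrapper's result. The names `squared_distance_example` and `distance_example` are hypothetical and are not the functions in dkm::details; the sketch assumes a floating-point `T`, and it does not say which option the maintainer intended.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for a squared Euclidean distance function.
template <typename T, std::size_t N>
T squared_distance_example(const std::array<T, N>& a, const std::array<T, N>& b) {
    T sum = T(0);
    for (std::size_t i = 0; i < N; ++i) {
        const T diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Hypothetical wrapper: the actual Euclidean distance, built on the
// squared version so that callers who only compare distances can still
// skip the square root.
template <typename T, std::size_t N>
T distance_example(const std::array<T, N>& a, const std::array<T, N>& b) {
    return std::sqrt(squared_distance_example(a, b));
}
```

Keeping both has one practical advantage: the assignment step of k-means only compares distances, so it can use the squared form, while utilities that report distances can call the wrapper.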