luispedro / milk

MILK: Machine Learning Toolkit
http://www.luispedro.org/software/milk
MIT License

Added initial centroid parameter for kmeans #11

Closed mynameisfiber closed 11 years ago

mynameisfiber commented 11 years ago

There are many applications where specifying the initial centroids for a kmeans run is useful, for example when evaluating new initialization methods or when building an iterative kmeans algorithm (e.g. xmeans).

This pull request adds the base functionality for this in milk.unsupervised.kmeans, which should extend to all other related kmeans functions.
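
To illustrate the idea (this is not the code from the pull request), here is a minimal pure-Python sketch of Lloyd's algorithm that accepts optional starting centroids. The function name `kmeans_sketch` and the parameter name `initial_centroids` are hypothetical and chosen only for this example; milk's actual signature may differ.

```python
import random


def kmeans_sketch(points, k, initial_centroids=None, max_iter=100, seed=0):
    """Lloyd's algorithm on a list of equal-length tuples.

    `initial_centroids` is a hypothetical parameter name used for
    illustration; it is not taken from milk's API.
    """
    if initial_centroids is not None:
        # Caller-supplied starting points (e.g. from a custom init method).
        centroids = [tuple(c) for c in initial_centroids]
        if len(centroids) != k:
            raise ValueError("expected exactly k initial centroids")
    else:
        # Fall back to picking k distinct data points at random.
        centroids = random.Random(seed).sample(points, k)

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    assignments = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        new_assignments = [
            min(range(k), key=lambda j: sqdist(p, centroids[j])) for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centroids


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
assignments, centroids = kmeans_sketch(
    points, 2, initial_centroids=[(0.0, 0.0), (10.0, 10.0)]
)
```

An iterative scheme such as xmeans could call a function like this repeatedly, passing the centroids from the previous round (plus any newly split ones) as the starting point for the next.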

luispedro commented 11 years ago

Thanks. Merging...

mynameisfiber commented 11 years ago

@luispedro no worries! I was wondering, has the effort to parallelize the C++ functions with OpenMP stopped? I threaded some parts of the kmeans algorithm and it seems to work nicely.

luispedro commented 11 years ago

Mostly because it is not very portable and can fail miserably if you try to run something else on the same machine: performance does not degrade gracefully; instead, everyone just fights for CPU.

It just didn't feel appropriate for a library that may be used in many different environments...