Closed mhamilton723 closed 3 years ago
Hi, the current implementation already supports a limited version of minibatch kmeans. However, you need to provide the full training data to the fit / fit_predict method, and fit_predict will randomly select a minibatch from the training data at each iteration of the kmeans algorithm. To enable it, set the minibatch parameter to the desired minibatch size, as shown here
If you can't provide the full training data all at once, then yes, it's possible to modify the current version to support that kind of minibatch kmeans as well.
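To make the two variants concrete, here is a rough standard-library sketch. It is not this library's code or API: the class `MiniBatchKMeansSketch`, its methods, and the `n_iter` and `seed` arguments are hypothetical names made up for illustration, and the update rule is the usual per-centroid running mean, which may differ from what the library actually does. `fit` mirrors the behaviour described above (full data passed in, one random minibatch drawn per iteration), while `partial_fit` shows roughly what a streaming variant could look like when the data arrives in chunks.

```python
import random


def nearest(point, centroids):
    """Return the index of the centroid closest to `point` (squared Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


class MiniBatchKMeansSketch:
    """Hypothetical illustration only; not the API of the library in this thread."""

    def __init__(self, centroids):
        # Initial centroids are supplied by the caller (an assumption of this sketch).
        self.centroids = [list(c) for c in centroids]
        self.counts = [0] * len(self.centroids)

    def partial_fit(self, batch):
        # Assign the whole batch against the current centroids first ...
        assignments = [nearest(x, self.centroids) for x in batch]
        # ... then move each centroid toward its points with a 1/count step,
        # so every centroid stays the running mean of the points it has absorbed.
        for x, k in zip(batch, assignments):
            self.counts[k] += 1
            lr = 1.0 / self.counts[k]
            self.centroids[k] = [c + lr * (xi - c) for c, xi in zip(self.centroids[k], x)]
        return self

    def fit(self, data, minibatch, n_iter=50, seed=0):
        # Full data is available: draw one random minibatch per iteration.
        rng = random.Random(seed)
        for _ in range(n_iter):
            self.partial_fit(rng.sample(data, min(minibatch, len(data))))
        return self

    def predict(self, data):
        return [nearest(x, self.centroids) for x in data]
```

With the full data in memory, `MiniBatchKMeansSketch(init).fit(data, minibatch=20)` covers the first case, where `init` is a list of starting centroids and `data` is a list of equal-length tuples. When the data only arrives chunk by chunk, calling `partial_fit(chunk)` once per chunk covers the second.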
If you want to run KMeans on GPU, you can take a look at TorchPQ, which has very fast and memory-efficient implementations of the KMeans and MinibatchKMeans algorithms.
Thank you for this pointer, @DeMoriarty!
Is it possible to modify the existing implementation to support minibatch k means? Thanks